Build block data clean

experimental design R

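Cleaning the ACEMOL building-block table in R: keeping the name and SMILES columns, stripping leftover label prefixes, and canonicalizing the SMILES.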

Jixing Liu https://jixing475.github.io/jixingBlog (Centre for Artificial Intelligence Driven Drug Discovery) https://www.mpu.edu.mo/esca/en/aidd.php
2021-02-03
library(tidyverse)
data_path_ACEMOL <- here::here("analysis/data/raw_data/Build_block_ACEMOL.csv")

ACEMOL

df <- read_csv(data_path_ACEMOL)

df %>% head()
#> # A tibble: 6 × 9
#>   标题   标题链接  CatalogNo. CasNo. Name  ReaxyNo. CNName `cards-box`
#>   <chr>  <chr>     <chr>      <chr>  <chr> <chr>    <chr>  <chr>      
#> 1 >(2S)… http://w… Catalog N… Cas N… >(2S… Reaxy N… CN Na… 216237-52-4
#> 2 >1-Am… http://w… Catalog N… Cas N… >1-A… Reaxy N… CN Na… 22059-21-8 
#> 3 >1-(B… http://w… Catalog N… Cas N… >1-(… Reaxy N… CN Na… 88950-64-5 
#> 4 >Meth… http://w… Catalog N… Cas N… >Met… Reaxy N… CN Na… 72784-43-1 
#> 5 >Meth… http://w… Catalog N… Cas N… >Met… Reaxy N… CN Na… 72784-42-0 
#> 6 >ethy… http://w… Catalog N… Cas N… >eth… Reaxy N… CN Na… 42303-42-4 
#> # … with 1 more variable: cards-box-desc1 <chr>
df %>% colnames()
#> [1] "标题"            "标题链接"        "CatalogNo."     
#> [4] "CasNo."          "Name"            "ReaxyNo."       
#> [7] "CNName"          "cards-box"       "cards-box-desc1"

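# Keep only the English name, Chinese name, and SMILES columns,
# then strip the label text and the leading ">" left over from the export.
df_clean <-
  df %>%
  select(name = Name,
         name_zh = CNName,
         SMILES = `cards-box-desc1`) %>%
  mutate(name_zh = str_remove(name_zh, "CN Name:"),
         name = str_remove(name, "^>"))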
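As a quick sanity check on the cleaning step, the sketch below applies the same `select()` and `str_remove()` calls to a small made-up tibble and then counts leftover prefixes and missing values. The rows and the names `toy` and `toy_clean` are illustrative assumptions, not ACEMOL data. The `str_squish()` and `na_if()` calls are also additions, assuming the export may leave whitespace or an empty string after the label.

```r
library(dplyr)
library(stringr)
library(tibble)

# Toy rows mimicking the exported layout; all values are made up.
toy <- tribble(
  ~Name,      ~CNName,         ~`cards-box-desc1`,
  ">ethanol", "CN Name: 乙醇", "CCO",
  ">benzene", "CN Name:苯",    "c1ccccc1",
  ">unknown", "CN Name:",      NA_character_
)

toy_clean <- toy %>%
  select(name = Name,
         name_zh = CNName,
         SMILES = `cards-box-desc1`) %>%
  mutate(name_zh = str_squish(str_remove(name_zh, "^CN Name:")),
         name = str_remove(name, "^>"),
         # Treat an empty Chinese name as missing rather than ""
         name_zh = na_if(name_zh, ""))

# Count rows that still carry a prefix or lack a name or structure.
toy_clean %>%
  summarise(leftover_prefix = sum(str_detect(name, "^>")),
            missing_name_zh = sum(is.na(name_zh)),
            missing_smiles = sum(is.na(SMILES)))
```

Running the same `summarise()` on `df_clean` would show whether any rows need attention before the SMILES are passed on.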



library(docking)
docking::init_py()

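# Canonicalize every SMILES string with the Python helper `smiles_to_canonical`.
# The result is stored explicitly: `.Last.value` is only updated at the
# interactive top-level prompt, so it may not hold this result when the
# file is knitted or sourced.
smiles_canonical <-
  df_clean %>%
  pull(SMILES) %>%
  map_chr(~ py$smiles_to_canonical(.x))

# Count the entries that came back as NA
sum(is.na(smiles_canonical))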
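The NA count is only meaningful if a failed conversion yields `NA` instead of stopping the whole `map_chr()` call. One way to get that behavior, assuming the converter raises an error on bad input, is to wrap it with `purrr::possibly()`. In this minimal sketch, `fake_canonical()` is a hypothetical stand-in for `py$smiles_to_canonical()` so the example runs without Python; it does no real canonicalization.

```r
library(purrr)

# Hypothetical stand-in for py$smiles_to_canonical(): it only rejects
# missing or empty input and otherwise returns the string unchanged.
fake_canonical <- function(smiles) {
  if (is.na(smiles) || !nzchar(smiles)) stop("cannot parse SMILES")
  smiles
}

# possibly() replaces an error with a fallback value instead of aborting.
safe_canonical <- possibly(fake_canonical, otherwise = NA_character_)

smiles_in <- c("CCO", "", NA, "c1ccccc1")
smiles_out <- map_chr(smiles_in, safe_canonical)

# Number of inputs that failed, and their positions.
sum(is.na(smiles_out))
which(is.na(smiles_out))
```

With the real helper, the same wrapper could be applied to `py$smiles_to_canonical`, and the positions from `which()` could be used to inspect the failing rows of `df_clean`.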



# df_clean %>% 
#   py$df_add_ROMol() %>% 
#   rename(ID = name) %>% 
#   open_in_DataWarrior()

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/jixing475/jixing.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Liu (2021, Feb. 3). Jixing Liu: Build block data clean. Retrieved from https://jixing.org/posts/2022-02-04-buildblock/

BibTeX citation

@misc{liu2021build,
  author = {Liu, Jixing},
  title = {Jixing Liu: Build block data clean},
  url = {https://jixing.org/posts/2022-02-04-buildblock/},
  year = {2021}
}