stringr::str_split(string, ",(?! )", simplify = TRUE)
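A minimal sketch of what that pattern does, assuming `string` holds records where a bare comma separates fields and a comma followed by a space belongs to the text. The sample values are made up for illustration.

```r
library(stringr)

# Hypothetical input: ", " is part of a field, a bare "," is a separator.
string <- c("Smith, John,42,Baltimore, MD", "Doe, Jane,37")

# ",(?! )" matches a comma only when it is NOT followed by a space
# (negative lookahead), so "Smith, John" stays in one piece.
parts <- str_split(string, ",(?! )", simplify = TRUE)

# simplify = TRUE returns a character matrix, one row per input string;
# shorter rows are padded with "".
print(parts)
```

With `simplify = FALSE` (the default) the same call returns a list of character vectors instead, which is easier to work with when rows have very different numbers of fields.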
"Required Packages
library(quanteda)
library(quanteda.textstats)
library(quanteda.textplots)
library(readr)
library(dplyr)
library(ggplot2)
library(stringr)
library(DT)
library(tidytext)
Understanding Text Analytics Fundamentals
Text analytic...
Co..."
stringr.tidyverse.org/articles/fro...
stringr::str_sort(x, numeric = TRUE)
[1] "1" "8" "9A" "10" "21A" "21B" "40"
x <- c("8", "10", "1", "40")
str_sort(x, numeric = TRUE)
[1] "1" "8" "10" "40"
Key takeaways:
awk = lightweight, precise
csvtk = robust, CSV-aware
stringr = complex logic in R
Always validate after replacing. Mistakes here can cost you a week.
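One way the "validate after replacing" step could look in R. This is only a sketch: the IDs and the version-stripping pattern are hypothetical stand-ins, not taken from the original thread.

```r
library(stringr)

# Hypothetical column of IDs, some with a ".<version>" suffix.
before <- c("ENSG0001.1", "ENSG0002.3", "ENSG0003")

# Strip a trailing ".<digits>" if present.
after <- str_replace(before, "\\.\\d+$", "")

# Validation: same length, no NAs introduced, and a count of what changed.
stopifnot(length(after) == length(before), !anyNA(after))
changed <- before != after
n_changed <- sum(changed)

# Look at the changed rows side by side before overwriting anything.
print(data.frame(before, after)[changed, ])
```

Comparing `n_changed` with the number of rows you expected to touch is a cheap check that the pattern did not match too much or too little.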
Prefer R?
For more complex regex:
library(stringr)
df$V5 <- str_replace(df$V5, "pattern", "replacement")
Want to go further? Use str_replace_all() or mutate() from dplyr.
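A small sketch of that next step, assuming a data frame `df` with a character column `V5` as in the snippet above. The values and the `"_"` to `"-"` replacement are placeholders.

```r
library(dplyr)
library(stringr)

# Hypothetical data; only the column name V5 comes from the snippet above.
df <- tibble(V5 = c("gene_A_1", "gene_B_2"))

# str_replace() changes only the first match in each string...
first_only <- str_replace(df$V5, "_", "-")

# ...while str_replace_all() changes every match. Inside mutate() the
# column is rewritten in the pipeline instead of via df$V5 <- ...
df <- df %>%
  mutate(V5 = str_replace_all(V5, "_", "-"))
```

Keeping the replacement inside `mutate()` makes it easy to write to a new column first (for example `V5_clean`, a made-up name) and compare it with the original before overwriting.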
con <- DBI::dbConnect(duckdb::duckdb())
dbplyr::translate_sql(stringr::str_trim(x), con = con)
#>
Also thank you, I did not know about this function
Key takeaways:
• Regex cleans messy data fast
• Works on gene IDs, labels, metadata
• Combine with stringr for clarity
cghlewis.github.io/data-wrangli...
"The WiRe" about trying to catch {stringr} Bell and tidyverse-ing up BaltimoRe