#RStats
CRAN updates: bslib cequre fastAFT gmvarkit gpboost skilljaR #rstats
New CRAN package robotoolbox with initial version 1.3.2
#rstats
https://cran.r-project.org/package=robotoolbox
New CRAN package fritools2 with initial version 4.1.0
#rstats
https://cran.r-project.org/package=fritools2
New CRAN package cereal with initial version 0.1.0
#rstats
https://cran.r-project.org/package=cereal
@milesmcbain There’s also @brodriguesco's Workshop for Ukraine on reproducible pipelines in R on June 29, which covers {targets} as well as {renv} and Docker #rstats
https://sites.google.com/view/dariia-mykhailyshyna/main/r-workshops-for-ukraine#h.i3fjt5lw8dyo
CRAN updates: GJRM #rstats
In case this is useful to anyone else, here's an #rstats function I keep needing:
#' List to Text
#'
#' Convert a list or vector to text with human-readable separators, e.g., "A, B & C".
#'
#' @param x The list or vector to convert
#' @param comma The text to use to separate all but the last item
#' @param and The text to use to separate the last item
#'
#' @return A character string
#' @export
#'
#' @examples
#' comma_and(LETTERS[1:5])
#' comma_and(LETTERS[1:5], and = " and ")
#' comma_and(LETTERS[1:5], comma = "; ")
#'
#' # change `and` to use an Oxford comma
#' my_list <- list("Nelson Mandela",
#' "an 800-year-old demigod",
#' "a dildo collector")
#' comma_and(my_list) # probably not what you mean
#' comma_and(my_list, and = ", and ")
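comma_and <- function(x, comma = ", ", and = " & ") {
  if (length(x) <= 1) {
    # zero or one item: nothing to join, but still return a character string
    txt <- paste(x, collapse = "")
  } else {
    last <- x[length(x)]
    # join everything except the last item with `comma`
    first <- paste(x[-length(x)], collapse = comma)
    txt <- paste0(first, and, last)
  }
  return(txt)
}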
CRAN updates: survstan #rstats
New CRAN package GGoutlieR with initial version 1.0.0
#rstats
https://cran.r-project.org/package=GGoutlieR
Customize what happens when you start R: https://henrikbengtsson.github.io/startup/ #rstats #environment
CRAN updates: kohonen saeHB.ME.beta #rstats
New CRAN package FuzzySimRes with initial version 0.1.2
#rstats
https://cran.r-project.org/package=FuzzySimRes
New CRAN package epiworldR with initial version 0.0-1
#rstats
https://cran.r-project.org/package=epiworldR
New CRAN package cvmaPLFAM with initial version 0.1.0
#rstats
https://cran.r-project.org/package=cvmaPLFAM
CRAN updates: SIRthresholded #rstats
I forgot to post the hex logo for {carelesswhisper} - so here it is.
Andrew and George.

New #rstats package tinytest2JUnit exports tinytest results for integration with CI/CD workflow. Thanks to @openanalytics 👍
CRAN removals: BMRSr GapAnalysis multilevelcoda RMixtComp spatPomp #rstats
CRAN updates: microeco #rstats
CRAN updates: lrstat #rstats
CRAN updates: pm3 RESI snvecR #rstats
In all likelihood what they forgot, and continue to forget, to teach you about R is {targets}.
But I am heartened that there are 2 talks on the bill for #positconf2023 citing the most impactful #rstats tool since ggplot! Bravo.
The {carelesswhisper} R pkg ships with just the tiniest model. This is fast, and good for proof-of-concept speech-to-text.
If you're interested in good speech-to-text, then please download & try one of the larger whisper models.
They range up to 4GB in size, but even some of the medium sized models are very good.
Notes:
* If you're only interested in the English language, then use one of the "english only" models.
* Larger models are slower!
Models: https://ggml.ggerganov.com/
Am I the only #rstats gt user that constantly wants the location row selectors to operate as expressions that are evaluated on each selected column independently, i.e., ‘~ .x < 50’ to select cells less than 50 in each column separately?
📦 tidyfst
📝 Tidy Verbs for Fast Data Manipulation
🔗 https://cran.r-project.org/web/packages/tidyfst/index.html
CRAN updates: boinet grates #rstats
While sin is a made-up concept, the thing Christians say about sin is 100% true when applied to working with dates in #datascience:
"Sin will take you farther than you want to go, keep you longer than you want to stay, and cost you more than you want to pay."
CRAN updates: admiral api2lm BFS crossval PaRe RcppJagger #rstats
CRAN updates: altdoc #rstats
@rmflight miniaudio has facilities for WAV and mp3 decoding, so it's conceivable that you could do something like this.
ctx <- whisper_init()
whisper_stream(ctx, "sound_file.mp3")
And this would internally decode/translate on the fly (without decoding the audio into a single gigantic sound file)
CRAN updates: historicalborrowlong #rstats
📢 I am going to teach a new two-part workshop series in collaboration with Pearson × O'Reilly Media:
"Hands-On Data Visualization with ggplot2" 📈👩‍💻🧑‍💻
Follow the first session "Concepts" next Tuesday
📅 June 13, 12pm-3pm EST
👉 https://www.oreilly.com/live-events/hands-on-data-visualization-with-ggplot2-concepts/0636920089879/
#rstats #tidyverse #ggplot2 #dataviz #datavis #datavisualization
CRAN updates: gifski #rstats
New CRAN package texor with initial version 1.0.1
#rstats
https://cran.r-project.org/package=texor
New CRAN package ProFAST with initial version 1.2
#rstats
https://cran.r-project.org/package=ProFAST
Because procrastination and because knitr makes it dead easy (yay #RStats!), I rendered a gitbook version of my second paper:
https://eliocamp.github.io/publication/shceof/gitbook/abstract.html
What is your favorite movie about statistics?

Now on CRAN! {ami} (“am I?”) is a unified collection of lightweight checks that can be used to better understand the environments in which your #rstats code is running.

📢Calling all biologists and ecologists!
Join our Introduction to Statistics in R course with @PhilipLeftwich in July. Unlock the mysteries of R & the world of statistical analysis, gaining confidence in your skills!! https://physalia-courses.org/courses-workshops/course13/ 📊
#Statistics #Rstats #DataScience

The {airnow} #rstats 📦 lets you query & retrieve air quality information from the US government's AirNow API. “Current and historical readings as well as forecasts can be retrieved as tidy data frames.” By @briandconnelly . On CRAN.
https://briandconnelly.github.io/airnow/
Of special interest to those of us in the Northeast US at the moment!
#AirQuality #WildFires @rstats


Europe's police emergency numbers for this week's #MapPromptMonday, Safety
Code: https://github.com/gkaramanis/mappromptmonday/tree/master/2023/2023-week_23
Source: https://en.wikipedia.org/wiki/List_of_emergency_telephone_numbers

Note: I've only tested on macOS, and would be very interested to hear if this works on other platforms.
Introducing {carelesswhisper} - automatic speech recognition in #RStats using whisper.cpp
Attached vid: Live capture of my R session: recording 2 seconds of audio and translating to text.
This app includes the smallest (70MB) multi-language model. It can translate other languages to English too!
Also includes built-in audio recording code based upon miniaudio.
Pkg should work out of the box. No dependencies. (in theory!)

When you check in on your #rstats package again and the code looks like an unwanted time capsule? Do some spring cleaning! The tidy team does it every year and because not everyone enjoys it as much as I do, we have a few things to make it more fun: @andyteucher wrote it all up, from dedicated time to helpful checklists (usethis::use_upkeep_issue() 😍 ) to tongue-in-cheek celebratory certificates at the end.
https://www.tidyverse.org/blog/2023/06/spring-cleaning-2023/
I drafted a new vignette on "intersectionality analysis", using the MAIHDA framework with mixed models and {ggeffects}. The vignette not only shows how to quantify variation (inequalities), but also how to compare different groups at risk
https://strengejacke.github.io/ggeffects/articles/practical_intersectionality.html
Intersectionality analysis is a new approach in social epidemiology, which attempts to move away from looking at relevant social indicators in isolation and rather looks at effects of belonging to specific strata simultaneously. #rstats
For the first time in what feels like forever, I will live-stream @rstats Shiny app development as part of R/Medicine 2023! I'll create a brand-new #shiny app to interactively explore a licorice and gargling clinical study, powered by:
🦏 {rhino} framework @appsilon
🔍 Drill-down summary tables using {Tplyr} and {reactable}
⭐️ New dashboard capabilities from {bslib} @cpsievert @Posit
Come by and say hello!
📆 June 8th 1 PM EST / 7 PM CEST
📡 https://twitch.tv/rpodcast

Code review is an important part of software development!
@davis shares how #tidyverse made our process explicit in a code review principles guide.
Feel free to modify these principles for your own needs & we’d love to hear about it if you do!
https://www.tidyverse.org/blog/2023/06/code-review-principles/
Spring Cleaning applies not only to our homes, but also to the code that we maintain. 🌸
@andyteucher shares how the #tidyverse team tackles this together & shows a new feature in {usethis} that will help you organize your own Spring Cleaning!
https://www.tidyverse.org/blog/2023/06/spring-cleaning-2023/
Imagine #RStats blogging as a hobby, lol. Anyway, here's two more posts.
👮 Rectangularise Word tables extracted by {officer}: https://www.rostrum.blog/2023/06/07/rectangular-officer/
📊 Recreating a dataviz with #ggplot2: https://www.rostrum.blog/2023/05/10/spear-ggplot2/


“The {datawizard} #rstats 📦 (from the easystats ecosystem) has two very useful functions to deal with duplicates:
* data_duplicated: Extract all duplicates including the first, unlike duplicated() or dplyr::distinct()
* data_unique: by default selects the ‘best’ duplicate” - Rémi Thériault
https://easystats.github.io/datawizard/reference/data_duplicated.html
https://easystats.github.io/datawizard/reference/data_unique.html
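To make the "including the first" point concrete, here is a minimal base R sketch (not {datawizard}'s implementation; the vector `x` is made up for illustration). `duplicated()` only flags the later occurrences, so catching every member of a duplicate group needs a second pass from the other end:

```r
# Hypothetical example vector with two duplicated values
x <- c("a", "b", "a", "c", "b")

# duplicated() flags only the second and later occurrences
later_only <- duplicated(x)

# Scanning from both ends flags every member of a duplicate group,
# including the first occurrence
all_dups <- duplicated(x) | duplicated(x, fromLast = TRUE)

x[later_only]
x[all_dups]
```

This only shows the base R behaviour the quote is contrasting against; how data_duplicated() and data_unique() handle data frames and the "best" duplicate is described in the linked reference pages.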


Dear #RStats community,
I'm preparing a presentation on #R and #Wikipedia. Do you know of any other interesting packages besides WikipediR, WikidataR and GlittR?
📌
@rladiesrome
is hosting two events this month:
1.- Data Science Best Practices
Speaker: Dr. Simina Boca
When: June 12, 2023 at 6.00 PM CET / 12.00 PM EDT
RSVP: https://bit.ly/444pL6X
2.- One Health and the Politics of Coronaviruses
Speaker: Dr. Laura Kahn
When: June 30, 2023 at 5.00 PM CET / 11.00 AM EDT
RSVP: https://bit.ly/43MM3Kg

The latest Excel blunder from Austria is a lesson in why we need professional data people.
There's a widespread expectation that anyone can take Excel and use it to do critical things with data.
But data people know that we need the right tools to make data processing verifiable (usually with code) and enable us to check that everything is as it should be (with unit tests or assertions).
And most of all, we need more #dataliteracy at every level.
https://www.theregister.com/2023/06/06/austria_election_excel_blunder/
In which I argue that inheritance is backwards for statistical methods objects #rstats
https://notstatschat.rbind.io/2023/06/07/blank-cheque-inheritance-and-statistical-objects/
“The {marginaleffects} 📦 book is now online! 25 chapters on post-estimation analyses and interpretation with #Rstats. The 📖 is full of tutorials, case studies, tips, and technical notes. Please check it out and let us know how we can improve this resource” - @vincentab
https://vincentarelbundock.github.io/marginaleffects/
This GitHub repo has a number of interactive Shiny apps for "self-discovery of statistical concepts and rules-of-thumb." By Devan Becker
https://github.com/DBecker7/DB7_TeachingApps
#rstats #RShiny #statistics @rstats


>Rsearcher finds factors linked with chronic school absenteeism
Is an Rsearcher a researcher who uses #Rstats?
https://phys.org/news/2023-06-rsearcher-factors-linked-chronic-school.html
Rediscovering once again the thing where, if you want to generate a series of identically formatted ggplots with different data subsets you have to wrap the plot in `print()` to write to pdf within the `for()` loop #Rstats
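For anyone hitting the same thing, a minimal sketch of the pattern, assuming {ggplot2} is installed and using `mtcars` split by `cyl` as a stand-in for the data subsets (the file name and variable names are placeholders). Inside a `for()` loop nothing is auto-printed, so each plot has to be printed explicitly to land on a page of the PDF:

```r
library(ggplot2)

# Placeholder output file; swap in a real path as needed
out_file <- tempfile(fileext = ".pdf")

pdf(out_file)
for (cyl_value in sort(unique(mtcars$cyl))) {
  subset_data <- mtcars[mtcars$cyl == cyl_value, ]

  p <- ggplot(subset_data, aes(x = wt, y = mpg)) +
    geom_point() +
    labs(title = paste("cyl =", cyl_value))

  # Without print(), the plot object is created but never drawn
  print(p)
}
dev.off()
```

Each iteration adds one page to the PDF; dropping the `print()` call typically leaves a file with no usable pages.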
Keeping it simple and minimal with a slope chart for #TidyTuesday this week! Using data from @ourworldindata, I focused in on how coal production per capita has changed in different countries since 1960 🔥
Code: https://github.com/nrennie/tidytuesday/tree/main/2023/2023-06-06

#RStats tip: I only recently learned about `Hmisc::smean.sd` (and friends), and their wrappers in `ggplot2::hmisc`, such as `ggplot2::mean_cl_boot()`. I'm still learning new ways to make {ggplot2} do my data munging for me.
📦 usethis v2.2.0 is out! 📦
The theme of this release is "a year of miscellaneous maintenance" 😅 It's also the version most closely tied to the upcoming print publication of R Packages 2e. Finally, we welcome @andyteucher as a new author! #rstats
I'm brainstorming for #RStats @rOpenSci Coworking themes to use in the future.
Some examples of things we've already done...
- Start Writing that Package!
- Getting Started with targets!
- Working with New R Users
- Setting Up Continuous Integration
- Checking Data with naniar, visdat, assertr, and skimr!
- Working with Taxonomic Lists
Any suggestions for future themes? Anything you might like to see?
🚀 [Blog post] Meeting the stars of the R-Universe. This month's interview: an Open Source Project to Take Care of the Planet.
We learn about the PEcAn project, where they develop open source tools and models for climate change.
✍️ https://ropensci.org/blog/2023/06/06/r-universe-stars-4-en/
From the inbox: How can I get fold assignments from spatialsample?
https://www.mm218.dev/posts/2023-06-06-spatialsample_splits/
Having learned programming mostly in #rstats, I realize that I have a very fuzzy mental model of what "compiling" code even means. Can someone point me to an explanation of what it means to "compile" or "build from source" for semi- or non-experts?
Unleash the power of reproducible workflows in R with @eliocamp and @paocorrales ! 🧩 Join our course to master collaboration, version control, and seamless document generation. Empower your data science journey with efficiency and reliability! 🚀💡
https://physalia-courses.org/courses-workshops/r-reproducibility/

The {marginaleffects} 📦 book is now online! 25 chapters on post-estimation analyses and interpretation with #Rstats. The 📖 is full of tutorials, case studies, tips, and technical notes. Please check it out and let us know how we can improve this resource vincentarelbundock.github.io/marginaleffects

I'm very pleased to announce the next speaker of the #RLadiesCambridge #dataviz lunch is @tanya_shapiro Tanya, an independent data consultant, will talk about interactive data visualisation with {ggiraph}. Join us on the 23rd of June at 12.30 pm BST/ 7.30 am EDT. #rstats
https://www.meetup.com/rladies-cambridge/events/293991363/
New blog post! Three useful (to me) #RStats patterns
☑️ utils::modifyList()
☑️ rlang::%||%
☑️ Base R Set Operations
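A quick sketch of what those three patterns look like in practice. These are illustrative examples, not taken from the blog post: the `defaults` and `overrides` lists are made up, and `%||%` is defined locally here as a stand-in for `rlang::%||%` so the snippet runs without {rlang}.

```r
# Local stand-in for rlang::%||% : use y when x is NULL
`%||%` <- function(x, y) if (is.null(x)) y else x

# Hypothetical settings lists
defaults  <- list(colour = "black", size = 1)
overrides <- list(size = 2, shape = "circle")

# modifyList(): overlay user overrides on top of defaults
settings <- utils::modifyList(defaults, overrides)

# %||%: fall back to a default when an element is missing (NULL)
title <- settings$title %||% "Untitled"

# Base R set operations on the option names
all_names    <- union(names(defaults), names(overrides))
shared_names <- intersect(names(defaults), names(overrides))
new_names    <- setdiff(names(overrides), names(defaults))
```

Here `settings` keeps `colour`, takes the overridden `size`, and gains `shape`; `title` falls back to "Untitled" because `settings$title` is NULL.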
📦 tidyEmoji
📝 Discovers Emoji from Text
🔗 https://cran.r-project.org/web/packages/tidyEmoji/index.html
Starting to mix #tidymodels with #targets for the first time in #rstats... I'm interested to hear tips from anyone who's trodden this path before.
It seems that the tidymodels idea of making a large specification which is evaluated late in one large computation is kind of at odds with the value targets brings to caching intermediate steps?
“The often-overlooked do.call() #rstats function is a powerful tool that allows you to dynamically call other functions, opening up a world of possibilities for code organization, reusability, and flexibility.” - @stevensanderson
https://www.spsanderson.com/steveondata/posts/2023-06-01/index.html
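A small base R sketch of the idea, with made-up inputs (not the examples from the linked post): `do.call()` takes a function and a list of arguments, which is handy when the arguments are only known at run time or already sit in a list.

```r
# Arguments assembled as a list, e.g. built up programmatically
mean_args <- list(c(1:10, NA), na.rm = TRUE)
avg <- do.call(mean, mean_args)

# The function can also be given by name
joined <- do.call("paste", list("a", "b", sep = "-"))

# Common use: row-bind a list of data frames of unknown length
pieces <- list(
  data.frame(id = 1:2, value = c("a", "b")),
  data.frame(id = 3:4, value = c("c", "d"))
)
combined <- do.call(rbind, pieces)
```

The `rbind` case is probably the one that comes up most often, since `rbind(pieces)` would treat the list as a single argument instead of spreading its elements.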
💜 @rladiesrome@bird.makeup is hosting:
👉 Data Science Best Practices
Speaker: @siminaboca
When: June 12th, 2023 - 6:00 PM CET / 12:00 PM EDT
RSVP: 🔗 https://www.meetup.com/rladies-rome/events/293609269/
#rstats #datascience #health #rladies #womenempowerment #onlinelearning #opensource
Verified oldest people for this week's #TidyTuesday. Big thanks to Aryn Toombs for the tutorial on making a beeswarm chart with circle packing https://aryntoombs.github.io/tutorials/beeswarm.html
code: https://github.com/gkaramanis/tidytuesday/tree/master/2023/2023-week_22

⚠️ rgdal, rgeos, and maptools won’t be available on CRAN after October 2023. ⚠️
What are the consequences of this change? What do you need to do as a user and as a developer?
Read my blog post at https://geocompx.org/post/2023/rgdal-retirement/

Sea level change for this week's #MapPromptMonday, Climate Change Vulnerability
Code: https://github.com/gkaramanis/mappromptmonday/tree/master/2023/2023-week_22

Video: Intro to Version Control using git and RStudio - presentation by Ryan Johnson, data science advisor at @Posit, at @NHSrCommunity
https://youtu.be/qNMOPWT8jSo

I still haven’t seen anything to disprove the best description of R vs Python for data/stats that I’ve seen:
Python is an elegant, well-designed language with a confusing, oddly designed data DSL bolted onto it & R is an elegant, well-designed data DSL with a confusing, oddly designed programming language built around it. #rstats
Is there a web API that you'd love to use with R, if only it weren't so painful? Or perhaps you're using one, but aren't sure if it would make sense as a package. Please let me know here! https://forms.gle/CJz12TzzHkGsnQma9
I often see really bad statistical errors in #python analyses that I don't see in #rstats. I don't think it's necessarily an issue with the language as much as it is the population using it. People with a CS degree and little statistics background often use python, and statisticians often use R.
While I know it's a generalization, I can't help but see python as a sort of anti-shibboleth for good statistical analysis.
Remember @yabellini's plan to translate the whole internet, I mean, @rOpenSci multilingual publishing project? https://ropensci.org/multilingual-publishing/
You might find these #RStats packages useful:
🌐 {babeldown} https://docs.ropensci.org/babeldown/ for translating Markdown-based content via DeepL API;
🌐 {babelquarto} https://docs.ropensci.org/babelquarto/ for rendering multilingual Quarto books.
Feedback welcome! They're still rather experimental but we do use them. 🧪
{sf} is an S-tier #rstats package
New blog post! We're about to embark on an 18-state, 5,000+ mile road trip (😱), so I figured I'd visualize it with #rstats and {sf}! Here's how to automatically get geocoded location and routing(!) data from OpenStreetMap and make fancy maps with ggplot #dataviz #rspatial https://www.andrewheiss.com/blog/2023/06/01/geocoding-routing-openstreetmap-r/




that's not actually the same as a "wide" format, and in any case wide and long are most of the time perfectly interchangeable (see melt and dcast in #rstats)
in particular, wide formats are quite easy, at least on the eyes, when dealing with time series data
@MagicTony @Bakeri666 @JASPStats
The software dictates the format, reshapes it as appropriate, and then feeds it into the analysis. For example, in #rstats, the afex library accepts long format data, aggregates any replicates, and runs an ANOVA.
The bigger point is that a tool designed to alleviate the programming requirement of analyses shouldn't require major preprocessing like reshaping data.
Ooo the {gt} #rstats 📦's new interactive option includes resizable columns with the opt_interactive(use_resizers = TRUE) argument.
Only thing missing for me in this early version is regular expression searching! (I do love that in {DT})
Looking forward to trying this:
https://posit.co/blog/new-in-gt-0-9-0-interactive-tables/
The glossary #rstats package is now on CRAN!
Glossary is a lightweight solution for making glossaries in educational materials written in Quarto or R Markdown. This package provides functions to link terms in text to their definitions in an external glossary file, as well as create a glossary table of all linked terms at the end of a section.