timogrossenbacher.ch

www.r-statistics.com
Guest post by Jake Russ. For a recent project I needed to run a simple sum calculation on a rather large data frame (0.8 GB, 4+ million rows, and ~80,000 groups). As an avid user of Hadley Wickham's packages, my first thought was to use plyr. However, the job took plyr roughly 13 hours to complete. plyr is extremely efficient...

tdhock.github.io
Statistical machine learning researcher working on fast optimization algorithms for large data.

www.r-spatial.org
Using data.table and Rcpp to scale geo-spatial analysis with sf. The background: At the beginning of 2017 I left academia to wor...

www.rdatagen.net
Simulation can be super helpful for estimating power or sample size requirements when the study design is complex. This approach has some advantages over an analytic one (i.e., one based on a formula), particularly the flexibility it affords in setting up the specific assumptions of the planned study, such as time trends, patterns of missingness, or effects of different levels of clustering. The downsides are the complexity of writing the code and the computation time, which can be a bit painful. My goal here is to show that at least writing the code need not be overwhelming.