Python4DataScience
@python4data.science
4.5K followers 4 following 69 posts
Teaching materials for the cusy training courses on a Python-based data science workflow: https://cusy.io/en/seminars
python4data.science
We have recently been asked frequently whether pandas is slow and whether Polars, Dask or DuckDB should be used instead, so we have written an initial overview of these technologies: www.python4data.science/en/latest/wo...
#Python #Performance #DuckDB
pandas
pandas is a Python library for data analysis that has become very popular in recent years. On the website, pandas is described thus: “pandas is a fast, powerful, flexible and easy to use open sourc...
www.python4data.science
Reposted by Python4DataScience
spackpm.bsky.social
💥Spack v1.0 is out!💥

This is a huge milestone. We reworked the core to add compiler dependencies, and we're introducing a stable package API.

🚀1.0 also adds concurrent builds, better includes, and much more -- read it all in the release notes!

github.com/spack/spack/...
github.com
python4data.science
The XKCD comic on reproducible scientific results fits perfectly with our tutorial 🧐 😉
www.python4data.science/en/latest/pr...
XKCD #3117: Replication Crisis
python4data.science
Perhaps even more significant than the success of #Python is the growth of #Jupyter #Notebooks: “Data scientists and machine learning researchers commonly use the #OpenSource application for #MachineLearning, #DataViz, and more.”
jupyter-tutorial.readthedocs.io/en/latest/in...
Graph from GitHub’s Octoverse 2024 report showing a spike in utilization of Jupyter Notebooks across GitHub. This is calculated by looking at the distinct number of public repositories with at least one Jupyter Notebook by the year the repository was created. Since 2016, we have seen this number surge from near zero to more than 1.5 million repositories using Jupyter Notebooks.
python4data.science
We have expanded our section on GitLab CI/CD pipelines with examples of
• GitLab Pages
• npm deployments with rsync
• building Docker containers
• multi-arch images with Buildah
• migrating GitHub Actions
www.python4data.science/en/latest/pr...
#GitLab #CICD #DevOps #DX
GitLab CI/CD
GitLab CI/CD can automatically build, test, deploy and monitor your applications during iterative code changes. This reduces the risk that you will develop new code based on buggy previous versions...
www.python4data.science
python4data.science
🎉 4000 Pythonistas and data scientists now follow us on Bluesky 🤗 We are delighted by the great interest in our materials.
#Python #DataScience
python4data.science
Our course on versioned and reproducible storage of code and data in data science workflows is now also referenced in the official Git documentation: git-scm.com/doc/ext
#Git #DataScience #DX
Git - External Links
git-scm.com