Lightnews — Scholar-powered news

@tinztwinshub.bsky.social

🤯 Wow, it's approximately 7000 times faster!

Large Pandas Dataframes can consume a large amount of memory. It's fascinating how processing data in smaller chunks can help prevent running out of memory and access data faster!

Are you looking for more Python tips?👇🏽
tinztwinshub.com/data-science...

December 1, 2025 at 5:43 PM
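
The chunking idea in the post above can be sketched roughly as follows, assuming pandas and a CSV source. The in-memory `csv_data` buffer and the `value` column are hypothetical stand-ins for a large file on disk, and the sketch only illustrates bounded memory use; it does not reproduce the speed-up the post mentions.

```python
import io

import pandas as pd

# Hypothetical stand-in for a large CSV file; a real path would work the same way.
csv_data = io.StringIO("value\n" + "\n".join(str(i) for i in range(10)))

total = 0
rows = 0

# With chunksize set, read_csv yields DataFrames of at most 4 rows each,
# so only one small chunk is held in memory at a time.
for chunk in pd.read_csv(csv_data, chunksize=4):
    total += chunk["value"].sum()
    rows += len(chunk)

# Combine the per-chunk partial results into the final statistic.
mean_value = total / rows
```

Aggregates that can be combined from partial results (sums, counts, minima, maxima) suit this pattern; operations that need the whole column at once, such as an exact median, do not.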

żwirek

@1zwirek1.bsky.social

I fixed the son of a bitch AND I FIXED THE DUPLICATE BUT I DIDN'T KNOW YOU COULD DO THAT WITH DATAFRAMES AND NOW IT FLIES, BITCHHHHH

November 27, 2025 at 8:45 PM

rogama ❄️ @globitos.bsky.social 🏳️‍🌈

@rogama25.es

I'm working on a task that involves extracting logs from Kibana (Elasticsearch), plotting charts, and building dataframes

November 27, 2025 at 11:38 AM

JauntyWunderKind

@jauntywk.bsky.social

ideally webgpu would also somehow be a target, in the hypothetical/ideal web-capable dataframe universe, but i don't see any projects at all doing webgpu ⨯ dataframes, much less building a query engine etc for it.

November 25, 2025 at 2:07 AM

RustTrending

@rusttrending.bsky.social

pola-rs / polars: Extremely fast Query Engine for DataFrames, written in Rust ★36211 https://github.com/pola-rs/polars

pola-rs / polars

Extremely fast Query Engine for DataFrames, written in Rust

github.com

November 24, 2025 at 2:51 AM

Buhane Information Technologies

@buhane.com.tr

Modern Dataframes In Python: A Hands-on Tutorial With Polars And Duckdb. Level up your Python data engineering skills with this comprehensive, hands-on tutorial that explores.... @cosmicmeta.ai #PyDE

https://u2m.io/NHdnj8Wn

Modern Dataframes In Python: A Hands-on Tutorial With Polars And Duckdb

Explore modern DataFrames in Python with a tutorial on Polars and DuckDB, covering installation, performance, code examples, and expert tips for data engineering.

cosmicmeta.ai

November 22, 2025 at 12:08 PM

AWS News(Unofficial)

@awsnews.bsky.social

AWS Glue launches Amazon DynamoDB connector with Spark DataFrame support

AWS Glue (https://aws.amazon.com/glue/) now supports a new Amazon DynamoDB connector that works natively with Apache Spark DataFrames. This enhancement allows Spark developers to work directly with Spark DataFrames, t...

#AWS #AwsGlue

AWS Glue launches Amazon DynamoDB connector with Spark DataFrame support

AWS Glue (https://aws.amazon.com/glue/) now supports a new Amazon DynamoDB connector that works natively with Apache Spark DataFrames. This enhancement allows Spark developers to work directly with Spark DataFrames, to share code easily across AWS Glue, Amazon EMR, and other Spark environments. Previously, developers working with DynamoDB data in AWS Glue were required to use the Glue-specific DynamicFrame object. With this new connector, developers can now reuse their existing Spark DataFrame code with minimal modifications. This change streamlines the process of migrating jobs to AWS Glue and simplifies data pipeline development. Additionally, the connector unlocks access to the full range of Spark DataFrame operations and the latest performance optimizations. The new connector is available in all AWS Commercial Regions where AWS Glue is available. To get started, visit the AWS Glue documentation (https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-dynamodb-dataframe-support.html).

aws.amazon.com

November 22, 2025 at 3:05 AM

AWS News Feed on 🦋

@awsrecentnews.bsky.social

🆕 AWS Glue introduces a DynamoDB connector for Spark DataFrames, enabling seamless code reuse across AWS Glue, EMR, and other Spark environments, simplifying data pipeline development and unlocking full Spark optimizations. Available in all AWS Commercial Regions.

#AWS #AwsGlue

AWS Glue launches Amazon DynamoDB connector with Spark DataFrame support

AWS Glue now supports a new Amazon DynamoDB connector that works natively with Apache Spark DataFrames. This enhancement allows Spark developers to work directly with Spark DataFrames, to share code easily across AWS Glue, Amazon EMR, and other Spark environments. Previously, developers working with DynamoDB data in AWS Glue were required to use the Glue-specific DynamicFrame object. With this new connector, developers can now reuse their existing Spark DataFrame code with minimal modifications. This change streamlines the process of migrating jobs to AWS Glue and simplifies data pipeline development. Additionally, the connector unlocks access to the full range of Spark DataFrame operations and the latest performance optimizations. The new connector is available in all AWS Commercial Regions where AWS Glue is available. To get started, visit AWS Glue documentation.

aws.amazon.com

November 22, 2025 at 2:40 AM

AWS Snarkbot

@aws-snarkbot.lastweekinaws.com

AWS Glue launches Amazon DynamoDB connector with Spark DataFrame support

AWS Glue finally lets you use standard DataFrames with DynamoDB instead of their proprietary DynamicFrame nonsense. Only took them how many years to support what literally every other Spark environment does by default?

November 22, 2025 at 2:10 AM

AWS What's New Skeetbot

@aws-skeetbot.lastweekinaws.com

AWS Glue launches Amazon DynamoDB connector with Spark DataFrame support

AWS Glue now supports a new Amazon DynamoDB connector that works natively with Apache Spark DataFrames, eliminating the need for Glue-specific DynamicFrame objects. This enables code reuse across platforms.

November 22, 2025 at 2:10 AM

pyladiescon.bsky.social

@pyladiescon.bsky.social

✨ Excited for Vitoria Rodrigues’ talk in the Others (General Interest) track on Dec. 5 at #PyLadiesCon!

She’ll introduce PySpark for beginners and show how data engineers use it. Curious about distributed processing and DataFrames? Check it out! 💜

#PyLadies #Python

November 21, 2025 at 12:00 PM

Datenpfad

@datenpfad.bsky.social

R code for creating the dataframe and formatting the data.

November 20, 2025 at 5:48 PM

Robin Blythe

@rbly.bsky.social

#rstats fam: Is there a 'tidy' way to fit ~50 {{brms}} models to 50 nested dataframes (same priors, same model, same everything except the data) without having to recompile every time? Any good tutorials/vignettes out there for doing this quickly/in parallel?
#statsky

November 20, 2025 at 3:46 AM

Towards Data Science

@towardsdatascience.com

Learn the foundational skill of creating Pandas DataFrames from multiple data sources. Ibrahim Salami's newest article includes clear, step-by-step instructions for initializing DataFrames from Python dictionaries, NumPy arrays, and CSV files, essential for any aspiring data analyst.

The Absolute Beginner’s Guide to Pandas DataFrames | Towards Data Science

Learn how to initialize dataframes from dictionaries, lists, and NumPy arrays

towardsdatascience.com

November 18, 2025 at 7:18 PM
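
The three initialization routes named in the post above (dictionaries, NumPy arrays, CSV files) might look like this in pandas. This is a minimal sketch, not code from the linked article; all column names and values are made up for illustration, and the CSV is read from an in-memory buffer instead of a file.

```python
import io

import numpy as np
import pandas as pd

# From a dictionary: keys become column names, values become the columns.
df_dict = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

# From a 2-D NumPy array: column names must be supplied separately.
df_array = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["x", "y"])

# From CSV: read_csv accepts a path or any file-like object.
# The StringIO buffer here is a hypothetical stand-in for a file on disk.
df_csv = pd.read_csv(io.StringIO("x,y\n1,2\n3,4\n"))
```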

Gaël Varoquaux

@gaelvaroquaux.bsky.social

@skrub-data.bsky.social: better data-science primitives for clean code on dataframes

Watch my dotAI talk, it's fun (live coding)!
www.youtube.com/watch?v=bQS4...
skrub really makes it easy to do machine learning with dataframes

Clean code in Data Science - Gael Varoquaux - Skrub DataOps, Probabl:

YouTube video by dotconferences

www.youtube.com

November 17, 2025 at 5:07 PM

Emily Riederer

@emilyriederer.bsky.social

Mostly a review of pretty standard methods, but some fun examples of enabling expression expansion (the magic behind column selectors!), more complex objects in a dataframe (models, vectors), and breaking the paradigm to go back to partitioned dataframes + list comprehensions

(2/2)

November 16, 2025 at 4:15 PM

OneMind-DataScience

@omdatascience.bsky.social

A minimum of base R. Recognising objects (vectors, dataframes, lists) is important

November 15, 2025 at 2:20 PM

rtomek.bsky.social

@rtomek.bsky.social

10-15 years ago, yeah R was clearly better. There’s still a couple things that R is useful for, so I use a library that will call R functions from Python using my pandas dataframes. Python is the way forward.

November 15, 2025 at 9:03 AM

dumb genius/smart idiot

@andrew.mcguiregis.com

Yeah, there's a mix of "Sync" and "sync" in the field I'm filtering by, and dataframes are case-sensitive

If I pipe in a QC = Sync | QC = sync, it puts them in two separate columns, and I need them in the same column.

So right now I'm .upper()-ing everything then filtering on QC = 'SYNC'

November 15, 2025 at 4:57 AM
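
The workaround described in the post above could be sketched like this, assuming the data lives in a pandas DataFrame (the post does not say which library is in use). The `QC` column name comes from the post; the sample rows and the `id` and `QC_norm` columns are hypothetical.

```python
import pandas as pd

# Hypothetical sample with mixed-case values in the QC column.
df = pd.DataFrame({"QC": ["Sync", "sync", "Other", "SYNC"], "id": [1, 2, 3, 4]})

# Compare an upper-cased view of the column, leaving the original values intact.
mask = df["QC"].str.upper() == "SYNC"
synced = df[mask]

# Alternatively, store a normalized column so that later grouping or pivoting
# treats "Sync" and "sync" as a single category.
df["QC_norm"] = df["QC"].str.upper()
```

Keeping the normalized values in a separate column preserves the original spelling for auditing while giving downstream steps one consistent key.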

Akin Unver

@akinunver.bsky.social

Part 2 - Create/modify Comma Separated Values-based dataframes: www.youtube.com/watch?v=q5Y6...

November 14, 2025 at 12:20 PM

Kyle R. Conway

@k-rey-c.social.coop.ap.brid.gy

Today I'm parsing datasets with #python and #pandas and coming to the conclusion that it would be convenient to be better versed in #matplotlib (or something else to produce charts from dataframes).

Does anyone have any personal recommendations to quickly get myself up to speed and producing […]

Original post on social.coop

social.coop

November 13, 2025 at 7:14 PM

JingleLu 🦌

@lulucastairs.bsky.social

I'm absorbing this whole dataframes business surprisingly well while working full time

November 12, 2025 at 4:30 PM

Robin

@mempler.de

Oh, i feel that… with polars when dealing with huge dataframes…

November 12, 2025 at 8:15 AM

objet petit ott

@isheoughtto.bsky.social

yea reasonably good at turning out dumb little snippets of well-trod code and smoothing over shit like "fuck how does this work in pandas vs spark dataframes again"

November 11, 2025 at 9:17 PM

RKiel

@rkiel.bsky.social

Just stumbled across the option of removing columns from dataframes like this:

df$Spaltenname <- NULL

That seems extremely handy to me. Does anyone know of any drawbacks to this method?

November 10, 2025 at 4:39 PM
