Mimoune Djouallah
mimdj.bsky.social
#MicrosoftFabric Customer advocate, interested in Small Data & Self Service. #Microsoftemployee since Dec 2023, but my tweets are my own
Explaining how Python engines read and write #DeltaTable is not for the faint of heart.
The theory is that everything will depend on delta kernel rust for read and write, but we are not there yet
github.com/djouallah/Fa...
#duckdb #delta_rs #datafusion #chdb #daft #polars #rust #lakesail
October 27, 2025 at 10:53 AM
October 19, 2025 at 2:05 PM
you are looking at #duckdb running tpch 1 TB with only 16 cores
it used to crash even with 64

pip install duckdb --upgrade is an act of faith basically
October 10, 2025 at 5:04 AM
October 3, 2025 at 12:18 PM
Put together a small python package duckrun :) point it at a folder of SQL/Python files, define a pipeline, and it will create Delta tables in #OneLake with #DuckDB and #delta_rs

github.com/djouallah/du...
October 3, 2025 at 11:17 AM
actually #Microsoftfabric Datawarehouse automatically exposes an Iceberg REST Catalog
thanks to the #duckdb UI extension, you can see a proper catalog
September 25, 2025 at 12:57 PM
2 months ago, I got access to a beta release of the #onelake #Apacheiceberg REST Catalog, the first thing I did was run it with #duckdb 😀
September 16, 2025 at 12:49 PM
storage format should not be tied to #SQL logic, #duckdb got it so right !!! but a bit sad that #deltalake is left behind :(
September 15, 2025 at 11:30 AM
first #apacheiceberg table written by #duckdb
September 6, 2025 at 12:07 PM
good news #duckdb added support for reading and writing geometry data type

Bad news : other Fabric engines don't support it yet, so it is not very useful for now :(
September 5, 2025 at 1:17 PM
September 1, 2025 at 10:00 AM
Writing #ApacheIceberg in Azure is not particularly hard, but you do need a catalog (essentially a database). For simple tests, you can use an in-memory DB
#ADLS #opentableformat #PyIceberg.
August 13, 2025 at 1:17 PM
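A minimal sketch of the in-memory catalog idea from the post above, assuming PyIceberg's SQL catalog backed by an in-memory SQLite database. The catalog name `demo`, the namespace `test`, the table `demo` and the sample rows are hypothetical, and the warehouse location is left as a parameter: a local `file://` path works for a quick test, while an ADLS location would also need storage credentials that are not shown here.

```python
def catalog_properties(warehouse: str) -> dict:
    """Catalog settings: an in-memory SQLite DB holds the Iceberg metadata pointers."""
    return {"uri": "sqlite:///:memory:", "warehouse": warehouse}


def write_demo_table(warehouse: str) -> int:
    """Create a namespace and a table, append three rows, return the row count."""
    # Imported lazily: requires `pip install "pyiceberg[sql-sqlite,pyarrow]"`.
    import pyarrow as pa
    from pyiceberg.catalog.sql import SqlCatalog

    catalog = SqlCatalog("demo", **catalog_properties(warehouse))
    catalog.create_namespace("test")  # hypothetical namespace

    data = pa.table({"id": [1, 2, 3], "name": ["a", "b", "c"]})
    table = catalog.create_table("test.demo", schema=data.schema)
    table.append(data)

    # Read the rows back through the same catalog.
    return len(table.scan().to_arrow())
```

Because the catalog lives in memory, the table is only discoverable for the lifetime of the process; the data and metadata files stay in the warehouse, but nothing points to them afterwards.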
I hope this is a fair subjective assessment of the current state of #Python data processing engines
August 12, 2025 at 1:41 PM
The first solar farm I worked on, 8 years ago :)
#PowerBI #Python
djouallah.github.io/AEMO-POWERBI/
June 1, 2025 at 7:04 AM
#Powerbi compression is insane !!!
April 27, 2025 at 12:49 PM
how to connect to #onelake using duckdb UI

step 1 : install azure cli on your laptop
step 2 : login to your account
step 3 : load the credential in DuckDB

CREATE or replace SECRET onelake(
TYPE azure,
PROVIDER credential_chain,
CHAIN 'cli',
ACCOUNT_NAME 'onelake'
);

enjoy
March 15, 2025 at 5:58 AM
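The OneLake steps above can also be scripted from Python. This is only a sketch: it reuses the secret from the post, and it assumes the DuckDB `azure` and `delta` extensions plus an `abfss://` path layout of `<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/Tables/<table>`. The helper names and the workspace, lakehouse and table arguments are hypothetical.

```python
def onelake_secret_sql(name: str = "onelake") -> str:
    """Return the CREATE SECRET statement from the post (Azure CLI login)."""
    return (
        f"CREATE OR REPLACE SECRET {name} (\n"
        "    TYPE azure,\n"
        "    PROVIDER credential_chain,\n"
        "    CHAIN 'cli',\n"
        "    ACCOUNT_NAME 'onelake'\n"
        ");"
    )


def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    """Build the abfss:// path to a Lakehouse table (layout is an assumption)."""
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


def count_rows(workspace: str, lakehouse: str, table: str) -> int:
    """Count the rows of a Delta table in OneLake; needs `az login` done first."""
    import duckdb  # imported lazily so the helpers above work without it

    con = duckdb.connect()
    for ext in ("azure", "delta"):
        con.execute(f"INSTALL {ext}")
        con.execute(f"LOAD {ext}")
    con.execute(onelake_secret_sql())
    path = onelake_table_path(workspace, lakehouse, table)
    return con.execute(f"SELECT count(*) FROM delta_scan('{path}')").fetchone()[0]
```

Calling `count_rows("my_workspace", "my_lakehouse", "my_table")` with real names only works after `az login`, since the secret picks up the CLI credential.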
this is a very nice problem to have in #PowerBI :)
January 7, 2025 at 11:47 AM
We don’t know how it happened, but one day in 2025, the test completed without errors on a single node. Nothing was ever the same after that
January 1, 2025 at 1:44 AM
worst thing about iceberg is solved !!! #duckdb
December 30, 2024 at 11:31 AM
ok, maybe my laptop is horrible, but reading the data using mounted storage from #onelake is faster than reading the same data stored on my laptop, WTF !!!
December 29, 2024 at 3:21 PM
ok you don't need java for reading s3 table from duckdb, but a bit worried about speed !!!
December 21, 2024 at 6:00 AM
in my personal tenant (melbourne region)
December 10, 2024 at 6:19 AM
managed to write to an s3 table, it is not too hard when you know where to look. does anyone know if we can read the data or list the files ? the internal s3 bucket doesn't seem to be accessible
December 7, 2024 at 1:40 PM
ok, you can create a namespace and a table name using boto3, it will create an internal bucket, but it seems not to be accessible
December 4, 2024 at 6:07 AM
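A rough sketch of the boto3 calls described above, assuming the `s3tables` client and its `create_namespace` / `create_table` operations. The table bucket ARN, namespace, table name and region are hypothetical placeholders, and the parameter names should be checked against the boto3 version in use.

```python
def namespace_request(bucket_arn: str, namespace: str) -> dict:
    """Arguments for create_namespace; the namespace is passed as a list."""
    return {"tableBucketARN": bucket_arn, "namespace": [namespace]}


def table_request(bucket_arn: str, namespace: str, table: str) -> dict:
    """Arguments for create_table; ICEBERG is the table format requested."""
    return {
        "tableBucketARN": bucket_arn,
        "namespace": namespace,
        "name": table,
        "format": "ICEBERG",
    }


def create_namespace_and_table(bucket_arn: str, namespace: str, table: str,
                               region: str = "us-east-1") -> dict:
    """Create a namespace, then a table inside it; needs AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("s3tables", region_name=region)
    client.create_namespace(**namespace_request(bucket_arn, namespace))
    return client.create_table(**table_request(bucket_arn, namespace, table))
```

The response only describes the table; as noted in the post, the storage bucket behind it does not seem to be directly accessible.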
out of core for the win #duckdb #Microsoftfabric
December 3, 2024 at 4:15 AM