What I found interesting this time - Belgium chillout edition

by: Artur Dziedziczak

March 24, 2024

“DuckDB as the New Jq,” n.d. https://www.pgrs.net/2024/03/21/duckdb-as-the-new-jq/

An alternative to the "jq" tool that lets you parse JSON by writing SQL-like queries.
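To give a feel for what "SQL over JSON" means, here is a minimal sketch. It does not use DuckDB at all: it swaps in Python's built-in sqlite3 module as a stand-in, so that it runs without installing anything. The sample data, the table name "data" and the helper query_json are all made up for illustration.

```python
import json
import sqlite3

# Hypothetical sample data, standing in for a JSON file you would normally feed to jq.
RAW = '[{"name": "duckdb", "stars": 3}, {"name": "jq", "stars": 5}, {"name": "sqlite", "stars": 4}]'


def query_json(raw, sql):
    """Load a JSON array of flat objects into an in-memory table called "data" and run sql on it.

    Illustrative helper only; it assumes every object has the same keys as the first one.
    """
    rows = json.loads(raw)
    columns = list(rows[0].keys())

    conn = sqlite3.connect(":memory:")
    column_defs = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f"CREATE TABLE data ({column_defs})")

    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO data VALUES ({placeholders})",
        [tuple(row.get(c) for c in columns) for row in rows],
    )

    result = conn.execute(sql).fetchall()
    conn.close()
    return result


# Roughly what a jq filter on .stars would do, written as SQL instead.
popular = query_json(RAW, "SELECT name FROM data WHERE stars >= 4 ORDER BY stars DESC")
print(popular)
```

The point is only that filtering and sorting read more naturally as SQL once the JSON is treated as a table. The linked post shows how DuckDB does this directly on JSON files, without the manual loading step.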

“Releasing Common Corpus: the Largest Public Domain Dataset for Training LLMs,” n.d. https://huggingface.co/blog/Pclanglais/common-corpus

It looks like people keep trying to create a common dataset for training LLMs that is public, does not rely on Common Crawl, and respects copyright.
Whenever I see such a project I get curious about what the actual source of the data is, who validated it, and whether it is really so transparent.
I suggest checking it yourself as I don’t really have time to dig into it :|
Still, I think this again shows that OpenAI, Microsoft, Midjourney, and other GenAI companies simply work on stolen content and should pay giant fines.

“My First Steps in Meshtastic,” n.d. https://stfn.pl/blog/28-intro-to-meshtastic/

The author describes his experience with Meshtastic.
If you would like to buy yourself this doomsday communication device, I think this blog post describes pretty well how lonely and hard the whole journey is ;)

“Guess Who’s Back? Exodus Scam BitCoin Wallet Snap!,” n.d. https://popey.com/blog/2024/03/exodus-wallet-part-three/

A short blog post analysing newly published crypto wallet applications.
Most of them are available in the Snap Store and, from what the author noticed, they do only one thing, and they don’t do it well.
When you enter your data, the application sends your wallet ID and password to the attacker over plain HTTP.

“Precision Agriculture: Crop Mapping Using Machine Learning and Sentinel-2 Satellite Imagery,” n.d. https://arxiv.org/pdf/2403.09651.pdf

Science time!
A simple piece of research on detecting agricultural crops.
Basically, the researchers took some Sentinel-2 images and trained different algorithms, like a convolutional U-Net, decision trees and logistic regression, to detect where the lavender fields are.
I remember using these methods for my Master's thesis, and again it’s super surprising that a random forest algorithm can be as good as a convolutional neural network.

“Parsing URLs in Python,” n.d. https://tkte.ch/articles/2024/03/15/parsing-urls-in-python.html

The Python URL-parsing library "can_ada" seems to be about 2 times faster than "urllib" and "ada_python".
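For reference, this is roughly what URL parsing looks like with the standard library's urllib.parse, the baseline in that comparison. It is a minimal sketch: the example URL and the helper describe_url are made up, and I am not showing "can_ada" here because I don't want to guess at its API.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url):
    """Split a URL into its main components using only the standard library.

    Illustrative helper, not taken from the linked article.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        # parse_qs collects repeated keys into lists
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
    }


info = describe_url("https://example.com:8443/blog/post?tag=python&tag=url#comments")
print(info)
```

If URL parsing sits on a hot path in your application, the numbers in the linked article are worth checking before swapping libraries. For occasional parsing, the standard library is probably fine.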

“Sensible Firefox Setup,” n.d. https://vermaden.wordpress.com/2024/03/18/sensible-firefox-setup/

A pretty good summary of which plugins you should use for Firefox.

“Science, Deceit & Healthcare: Navigating The Minefield of Alternative Medicine with Prof. Michael Baum,” n.d. https://open.spotify.com/episode/7ASDM1qFVR8vJ1MsPfxt1E?si=6Tfb66lOSh2XxvkSnhJdHA

A really interesting discussion about the place of alternative medicine in science.
tl;dl: there is room for studying placebo effects and for learning from quack doctors how to care for a patient.