What I found interesting this time - Hackathon edition

by: Artur Dziedziczak

March 17, 2024

Hackathon

I haven’t read much this week because of internal work at the hackathon and laziness afterwards.

The hackathon went really well. Our team did not win anything, but I think we built the most creative project in the whole company. The project was about visualizing different data sources within map polygons generated from a Voronoi diagram.

I cannot share screenshots of it as it was internal work, but believe me, it looked really good and was super functional.

I know it’s not much, but I would like to thank Monika, Guilherme, and Pedro for staying up late and working on our hackathon project. We really did something great, and I highly appreciate your work.

Reads update

“Street Scene Demo,” n.d. https://jamesgurney.substack.com/p/street-scene-demo

Great demo of how to paint super bright light scenes.
I still can’t figure out how people paint watercolor with such vibrant colors.

“What Was Your Prompt? A Remote Keylogging Attack on AI Assistant,” n.d. https://cdn.arstechnica.net/wp-content/uploads/2024/03/LLM-Side-Channel.pdf

Science time!
An interesting offensive security paper that describes an attack on SOTA LLM services.
tl;dr: they capture the encrypted packets you receive while interacting with ChatGPT and then use an LLM to predict what it returned. Because almost all LLM services stream their responses token by token in real time, the packet sizes reveal the length of each token, and this simple technique works rather well against commercial LLM solutions.
When it comes to numbers, the paper says:
"Using these methods, we were able to accurately reconstruct 29% of an AI assistant’s responses and successfully infer the topic from 55% of them."
29% is not much, but it is still a lot when you consider that this attack can work for any conversation sniffed from the victim’s network.
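To make the side channel a bit more concrete, here is a minimal sketch of how token lengths could leak from packet sizes. It is only a toy model, not the paper's code: I assume each packet carries exactly one token, that encryption preserves the payload length, and that every packet has the same fixed overhead. `OVERHEAD`, `simulate_stream`, and `infer_token_lengths` are names I made up for illustration.

```python
OVERHEAD = 40  # hypothetical constant per-packet overhead in bytes (assumption)


def simulate_stream(tokens):
    """Return the packet sizes an eavesdropper would see, one packet per token."""
    return [OVERHEAD + len(token.encode("utf-8")) for token in tokens]


def infer_token_lengths(packet_sizes, overhead=OVERHEAD):
    """Recover the length of every token from the observed packet sizes."""
    return [size - overhead for size in packet_sizes]


# The attacker never sees the text, only the sizes.
tokens = ["I", " have", " a", " rash"]
observed = simulate_stream(tokens)
lengths = infer_token_lengths(observed)
print(lengths)
```

The sequence of lengths is what would then be handed to a model that guesses the most likely text matching it.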

“How Figma’s Databases Team Lived to Tell the Scale,” n.d. https://www.figma.com/blog/how-figmas-databases-team-lived-to-tell-the-scale/

A blog post from the Figma database team showing how they built sharded databases on top of PostgreSQL with internal tooling.
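For anyone who has not worked with sharding, the core idea can be sketched in a few lines: a shard key is hashed and the hash decides which database holds the row. This is a generic illustration, not Figma's actual scheme. `NUM_SHARDS` and `shard_for` are hypothetical names, and real systems usually add a layer of logical shards so data can be moved later.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of physical databases (assumption)


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the result is stable across runs.
    return int.from_bytes(digest[:8], "big") % num_shards


# Every query for the same key is routed to the same shard.
for user_id in ["user-1", "user-2", "user-3"]:
    print(user_id, "->", shard_for(user_id))
```

A stable hash such as SHA-256 is used here instead of Python's built-in `hash()`, which is randomized per process for strings.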

“Processing One Billion Rows in PHP!,” n.d. https://dev.to/realflowcontrol/processing-one-billion-rows-in-php-3eg0

Amazing blog post about different optimizations you can make in PHP to run your code faster. Now, I get that not everyone is a fan of PHP. This post is different, though.
I think everyone should read it to understand the crucial parts of early optimizations:
- IO and disk reads/writes
- using references
- optimizing conditions
- multithreading (yes, PHP can run stuff in parallel!)
In the end, the author managed to bring his naive implementation, which took more than 20 minutes, down to 27.7 seconds!
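The parallel part is the easiest one to sketch outside of PHP. Below is a small Python version of the split, aggregate, and merge idea. It is not the author's code, and I assume input lines of the form `station;temperature`. The function names are mine, and I use threads only to keep the example short; for CPU-bound work in CPython you would swap in `ProcessPoolExecutor` to get a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def aggregate(lines):
    """Compute [min, max, sum, count] per station for one chunk of lines."""
    stats = {}
    for line in lines:
        station, _, value = line.partition(";")
        temp = float(value)
        entry = stats.get(station)
        if entry is None:
            stats[station] = [temp, temp, temp, 1]
        else:
            entry[0] = min(entry[0], temp)
            entry[1] = max(entry[1], temp)
            entry[2] += temp
            entry[3] += 1
    return stats


def merge(total, partial):
    """Fold the result of one chunk into the running total."""
    for station, (low, high, temp_sum, count) in partial.items():
        entry = total.get(station)
        if entry is None:
            total[station] = [low, high, temp_sum, count]
        else:
            entry[0] = min(entry[0], low)
            entry[1] = max(entry[1], high)
            entry[2] += temp_sum
            entry[3] += count
    return total


def process(lines, workers=2):
    """Split the input into chunks, aggregate each chunk in a worker, then merge."""
    size = max(1, -(-len(lines) // workers))  # ceiling division
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(aggregate, chunks))
    total = {}
    for partial in partials:
        merge(total, partial)
    return total


sample = ["Hamburg;12.0", "Lisbon;20.5", "Hamburg;8.0", "Lisbon;18.5"]
result = process(sample)
for station, (low, high, temp_sum, count) in sorted(result.items()):
    print(station, low, temp_sum / count, high)
```

Keeping the sum and count instead of the mean is what makes the partial results mergeable.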

“Using LLMs to Generate Fuzz Generators,” n.d. https://verse.systems/blog/post/2024-03-09-using-llms-to-generate-fuzz-generators/

The author describes how an LLM helped him create a fuzzer for his data format.
Before reading it I had no idea what a fuzzer is, so I checked it on Wikipedia.
“fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program”
With this in mind, I was able to read through the blog post and get some knowledge.
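To check that I understood the definition, here is about the simplest fuzzer I could come up with. It has nothing to do with the author's data format: `parse_record` is a toy parser for a made-up format (one length byte followed by the payload), and `buggy_first_byte` is a deliberately broken function so the fuzzer has something to find.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy parser for a hypothetical format: 1 length byte, then the payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


def buggy_first_byte(data: bytes) -> int:
    """Deliberately broken: raises IndexError on empty input."""
    return data[0]


def fuzz(target, runs=1000, seed=0):
    """Feed random byte strings to target and collect unexpected exceptions."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(runs):
        size = rng.randrange(0, 8)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # a clean rejection of malformed input is fine
        except Exception as exc:  # anything else counts as a crash
            crashes.append((data, exc))
    return crashes


print("parse_record crashes:", len(fuzz(parse_record)))
print("buggy_first_byte crashes:", len(fuzz(buggy_first_byte)))
```

Real fuzzers are much smarter about generating inputs (coverage guidance, grammars, mutation of valid samples), which is exactly the part the blog post tries to get from an LLM.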
In the end, the author suggests that LLMs hallucinate a lot and that for a custom format it was not a breeze. Still, he’s very hopeful that it might get better in the future.