What I found interesting this time

by: Artur Dziedziczak

February 18, 2024

“HexChat 2.16.2, The Final Release,” n.d. https://hexchat.github.io/news/2.16.2.html

Looks like the maintainer of HexChat made the final release.
I’m too young to actually remember IRC. I know that people still use it, but with other communication tools that are simpler to connect to, it’s not so popular anymore.
I tried HexChat a couple of years ago, and the number of plugins and customizations you could make to it was just insane.
It was a great project. Maybe some community will fork it. We’ll see.

“Automated Unit Test Improvement Using Large Language Models at Meta,” n.d. https://arxiv.org/pdf/2402.09171.pdf

Meta (aka Facebook) did some research on LLMs and unit test generation. It’s a pretty good paper, but I also found some issues with the numbers.
I’m not a scientist, but there is something fishy about the numbers and claims of the paper.
It’s written that the tool applies "a set of filters that assure measurable improvement over the original test suite, thereby eliminating problems due to LLM hallucination".
When you read what those filters are, you find out that there are three different filters. The first checks whether the generated code builds. The second checks whether the generated tests pass their assertions, running the suite five times to weed out flaky tests. The third checks whether the generated test improves the coverage percentage.
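To make the mechanics concrete, here is a minimal sketch of that three-filter pipeline. All names and interfaces are my own invention, not Meta’s actual code:

```python
# Hypothetical sketch of the paper's three filters; the names and
# interfaces are my invention, not Meta's actual tooling.

def passes_filters(builds, tests_pass, coverage, baseline_coverage, runs=5):
    """Return True only if a candidate test survives all three filters."""
    # Filter 1: the generated test must build with the project.
    if not builds():
        return False
    # Filter 2: its assertions must pass on every one of several
    # repeated runs, which weeds out flaky tests.
    if not all(tests_pass() for _ in range(runs)):
        return False
    # Filter 3: it must measurably improve coverage over the
    # original test suite.
    return coverage() > baseline_coverage
```

Each argument is a callable (or a number, for baseline_coverage) standing in for the real build, test, and coverage machinery.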
With all of this in mind, I would like to ask: how do these filters remove hallucinated code? From what I read, they only reduce the chances of hallucinated code reaching code review.
This is actually also acknowledged later, in the summary of the numbers. Meta evaluated these LLM tools on different products, and the results are generally good.
First, out of 42 tests, 4 were rejected and 2 were withdrawn. The reasons were that tests were generated for a trivial method, had multiple responsibilities, or failed to include a test case. So, is the third one hallucination? I would say so.
Next, there is another study on 280 diffs: 64 were rejected, 61 had no review, and 11 were withdrawn, but it’s not mentioned why the 64 were rejected. If it’s the same as for the 42, I expect it was due to hallucination, which Meta’s filters do not remove but only limit to some extent.
Ok, no more criticism! I actually like it when companies publish studies like this. It’s a good and healthy way to contribute to the global scientific community. Good work, Meta!

“Making My Bookshelves Clickable,” n.d. https://jamesg.blog/2024/02/14/clickable-bookshelves/

Someone created a clickable bookshelf with SVG polygons, Grounding DINO, the Segment Anything Model (SAM), and GPT-4.
I think the part where GPT-4 was involved was not necessary. It was used as an OCR API, and there are already algorithms and models that do OCR really reliably.
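For example, here is a minimal sketch of conventional OCR with Tesseract via the pytesseract bindings; the file name is hypothetical, not something from the post:

```python
# Minimal conventional-OCR sketch using Tesseract via pytesseract.
# "spine.png" is a hypothetical crop of a single book spine.
from PIL import Image
import pytesseract

text = pytesseract.image_to_string(Image.open("spine.png"))
print(text)
```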
It’s still an amazing project. Good stuff!

“The Decline of Usability: Revisited In Which We Once More Delve into the World of User Interface Design,” n.d. https://www.datagubbe.se/usab2/

Interesting rant about usability issues in current UIs.
It’s a good read. I learned from it about:
"Skeuomorph" - it’s a derivative object that retains ornamental design cues from structures that were necessary in the original (Wikipedia).
"Fitt’s law" - a predictive model of human movement used in human-computer interaction and ergonomics.
In the post, the author compares some UIs and rants about how bad they are now. What I would like to do now is argue for the current changes and their direction.
The first rant is about “colorful icons”. I disagree with the statements he makes. Colorful icons are for sure good for usability; sadly, they break immersion within an app. Many colors all around are difficult to combine with an app’s brand.
Let’s imagine Spotify. You cannot make the "play" icon green and the "stop" icon red within the Spotify UI, simply because it would look bad. I think nowadays there is a blurred line between usability and design, and it’s sometimes crossed by designers to make something that looks good. It’s not the best way, but it’s a tradeoff they need to make.
The second rant is about how good old UIs were. To back up this claim, the author shows an old IRC client and praises the beauty and usability of its colorful button icons. Next he mentions that nowadays Slack has these bland, dull icons which are hard to distinguish. Their design is also ambiguous: it’s not clear what those buttons do.
Well, I looked at your IRC app and I can tell you that I have no clue what your buttons are doing either.
I think the author does not fully grasp the idea behind icon buttons as they evolved from the old days. New icons should have tooltips or labels, and when the screen gets smaller, only the icon persists. I personally think this is the best compromise between usability and design freedom. Your users will find a way to use your program. Just give them a bit of time, and they will click through the app and remember the steps to achieve the expected results.
I’ll finish this quick note with a claim made by the author, followed by my comment.
"All the while I’m thinking: If modern application design is so great, why does everyone feel the need to change it all the time?" - and my answer is: because we are fucking grumpy apes that will always complain. The older we get, the grumpier we become and the more sick of change we get. Grab a glass of whiskey or some good quality orange juice and enjoy the ride!

“Video Generation Models as World Simulators,” n.d. https://openai.com/research/video-generation-models-as-world-simulators

OpenAI released their SOTA text-to-video generation model and it’s scary AF.
First of all, as always, there is no paper for it.  
Second of all, there is no mention of sustainability.
Lastly, there is no mention of training data source.
Again, even with all of these missing, I’m really impressed by the results. A couple more real papers and we will have some extremely good results.

“How To Center a Div,” n.d. https://www.joshwcomeau.com/css/center-a-div/

The best article I’ve ever read about centering a div.

“Why CMake Sucks?,” n.d. https://twdev.blog/2021/08/cmake/

Massive rant about the CMake build system, and I fully agree with the author.
Maybe I’ll provide some background for that. During my university days, I had to program a lot in C++. I understood virtual functions, templates, pointers, but! What I never understood was its build system.
It’s insanely convoluted. The CMake projects I made were usually set up once, and then I reused the same template for every other project.
I still use this template just to avoid going back to the CMake documentation. It’s so complicated and difficult to read that I never found the motivation to actually learn it.
There is also no point in learning it. Learning build tools should only be necessary for big projects, but with CMake you need to know almost everything from the start.
I love C++, but its build tools are just a nightmare. I haven’t checked if something has changed recently, but the author suggests using Meson or Bazel. Maybe I should try them.
Or better, learn Rust.

“(Plausible) Random Geography Generation with PostGIS: Fluviation,” n.d. https://di.nmfay.com/random-geography-fluviation

Someone used PostGIS to generate random terrain with simulated rivers. Fascinating project!

“The Last Dance : Robust Backdoor Attack via Diffusion Models and Bayesian Approach,” n.d. https://arxiv.org/pdf/2402.05967.pdf

Reading about backdoor attacks on machine learning models is something I can’t stop doing.
What I find interesting about them is that there are so many attack vectors. You can poison the test dataset, poison the training dataset, or even try to break a working model without any poisoning at all.
This paper presents a technique for poisoning the training data of a speech recognition system. Such poisoned training data can lead to a model that works normally in common usage, but as soon as an attacker supplies their own trigger query, it behaves differently.
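As a toy illustration of the general training-data-poisoning idea (this is the classic trigger-and-label-flip trick, not the paper’s diffusion-based method; everything here is made up for illustration):

```python
import numpy as np

def poison(x_train, y_train, target_label, rate=0.05, seed=0):
    """Stamp a fixed trigger onto a small fraction of samples and flip
    their labels, so a model trained on the result behaves normally
    except when the trigger is present."""
    rng = np.random.default_rng(seed)
    x, y = x_train.copy(), y_train.copy()
    idx = rng.choice(len(x), size=int(rate * len(x)), replace=False)
    x[idx, -4:] = 1.0       # the "trigger": a fixed patch in each poisoned sample
    y[idx] = target_label   # attacker-chosen label for triggered inputs
    return x, y
```

On audio, the "patch" would be a short trigger sound mixed into the waveform, but the principle is the same.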
This particular attack is only for speech recognition systems, but such attacks can happen to any diffusion model or LLM.
Fascinating read!