Information Found in May 2024

Published May 30, 2024
Updated Nov 3, 2025
Note: This old post was translated by AI.

##Books Read This Month

###Why Your Work Never Ends - Satoshi Nakajima

A book that teaches the "Rocket Start" work technique: tackle most of an assigned task in the first few days, then tell your boss how many days you need until completion. Embarrassingly, I didn't know who Satoshi Nakajima was until I read this book. After reading it, I followed his Twitter and newsletter.

The tone of the book makes it quite easy to read. The important points are repeated persistently in plain language, and you can tell he really wants readers to understand and practice the Rocket Start technique. Nakajima-san's personality seems to show through.

###How We Learn

I've been reading a lot of self-help-style books picked up in recent Kindle sales. This one turned out to be more of an education theory book than I expected, so at first I thought I had made a mistake buying it on the title alone. It turned out to be quite interesting, though.

Since it is written by a researcher, the arguments are very logical and based on experimental evidence, which also makes the content fairly dense.

One experiment I found interesting used a puzzle task in which you assemble scrambled parts, and asked what kind of intervention makes people solve it faster. The experiment, which was about subliminal effects, showed that merely presenting hints about the puzzle speeds things up. In other words, even just skimming information has an educational effect. I've needed to take in a lot of information efficiently lately, so it was reassuring to learn that even skimming counts.

That said, as is perhaps characteristic of this research field, there were some "isn't that logic strange?" moments. For example, this argument: when criticizing politics, you don't necessarily need a counter-proposal. When you are served bad food, do you need to provide a good recipe just to say "this is bad"?

My reaction to this metaphor was "that's not the same thing." For example, politicians should definitely present counter-proposals when criticizing the ruling party. For constructive policy discussion, criticism without alternatives often overlooks important context or makes unrealistic claims (especially from left-leaning people), so this metaphor is definitely wrong. But then again, would a professional chef present an alternative when saying "this is bad"? Hmm.

###The Way People You Want to See Again Speak

A book about speaking with the other person as the main character. It introduces many examples, which makes it easy to understand. It seems to require practice, so I should make use of YouTube videos too.

I've been trying to speak with others as the main character lately.

##NewsPicks Dialogue Community Nurse

The community nurse topic I'd been hearing about from Coten Radio etc.

No particular updates from what I'd heard before, but having more people analyze the same phenomenon means hearing different expressions from different perspectives, which is very interesting.

##"How Do You Know So Much About That?" A Great Way to Use Google Search That Always Amazes Colleagues

I knew about Google Scholar alerts but didn't know about this. I set up a few.

I'm often asked "how did you find this information?" and this should make me even stronger. https://www.google.co.jp/alerts#

##Humans Don't Need to Write Queries Anymore! We Developed and Deployed a Slack Bot That Automates SQL Query Assembly

Interesting. It seems they're using Claude. More and more solutions really are becoming possible thanks to Claude Opus.

Embarrassingly, I struggled quite a bit making Slack bots, so I wanted to reference the GitHub repository, but it's implemented in Go.

##This Is How You Use Local LLMs

This is really interesting. Being able to visualize entropy is...
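
As a rough illustration of what "visualizing entropy" can mean for a language model: at each generation step the model assigns a probability to every candidate next token, and the Shannon entropy of that distribution indicates how uncertain the model is. The sketch below only shows that general idea and is not the linked article's code; the probabilities are made-up numbers and the function names are hypothetical.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in bits) of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_bar(probs, width=20):
    """Render entropy as a text bar, scaled by the maximum possible entropy."""
    if len(probs) < 2:
        return "." * width  # a single candidate has zero entropy
    max_entropy = math.log2(len(probs))
    filled = round(width * token_entropy(probs) / max_entropy)
    return "#" * filled + "." * (width - filled)

# Hypothetical next-token distributions at two generation steps.
confident_step = [0.97, 0.01, 0.01, 0.01]  # the model is almost sure
uncertain_step = [0.25, 0.25, 0.25, 0.25]  # the model has no preference

print(entropy_bar(confident_step))
print(entropy_bar(uncertain_step))
```

A longer bar means the model was less sure which token to pick at that step.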

##Next-Gen Web Authentication "Passkeys"

An explanation of passkeys I've been seeing lately.

I've been using 1Password for the past few years. It quietly supports one-time password generation and can hold GitHub private keys, so it's very convenient. Recently 1Password started showing "do you want to register a passkey?" on sites like Twitter. Registering one somehow lets you log in without entering a password. My understanding is that passkeys use a public/private key pair instead of a password. It was helpful to understand this mechanism.

The slides mentioned that private keys are held in the cloud, and I wondered whether the provider, when it isn't 1Password, would be something like Chrome. I looked it up but couldn't figure it out.

Passkeys are great so I'll actively use them going forward.
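
To make the public/private key idea concrete, here is a deliberately simplified challenge-response sketch: the server keeps only a public key, sends a random challenge, and the device proves it holds the private key by signing that challenge. This is a toy under loud assumptions: it uses textbook RSA with tiny primes, which is completely insecure, and the function names are hypothetical. Real passkeys (WebAuthn) use standardized algorithms and also bind the signature to the site, none of which is modeled here.

```python
import hashlib
import secrets

# Toy RSA key pair built from tiny primes (61 and 53). Insecure, illustration only.
N = 61 * 53   # public modulus
E = 17        # public exponent (the server stores N and E)
D = 2753      # private exponent (stays on the user's device or password manager)

def digest(challenge: bytes) -> int:
    """Hash the challenge and reduce it into the toy key's range."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N

def sign(challenge: bytes) -> int:
    """Device side: sign the server's challenge with the private key."""
    return pow(digest(challenge), D, N)

def verify(challenge: bytes, signature: int) -> bool:
    """Server side: check the signature using only the public key."""
    return pow(signature, E, N) == digest(challenge)

# Login flow: the server sends a fresh random challenge and the device signs it.
challenge = secrets.token_bytes(32)
signature = sign(challenge)

print(verify(challenge, signature))            # genuine response
print(verify(challenge, (signature + 1) % N))  # tampered signature
```

The point is that no shared secret ever travels to the server: a leaked server database contains only public keys.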

The "History of Encryption" introduced in this slide is also an interesting video.

##Seaborn Bug Caused Wrong Paper Data

Professor Okumura's tweet.

Honestly, it's impossible to examine the internals of every library, and dealing with each case is tedious, but the risk does exist. Is this a structural problem with libraries developed by individuals as open source, where no one bears serious responsibility...?

At the very least, what we can do is keep our antennae up and not miss news like this.

##Markov Chain Monte Carlo Method

##Python-Only Web App Libraries Are Increasing (2024.05)

https://qiita.com/SFITB/items/c3361979b86f441993ff?utm_campaign=popular_items&utm_medium=twitter&utm_source=dlvr.it

Python web development libraries like Reflex and Flet are emerging.

Previously, I tried making a business app with Streamlit. As a result, implementing login screens and Japanese input in tables required various customizations.

I really understand this. After joining Logomix, I developed some fairly serious business apps, and Streamlit's components felt lacking for a web app. Sure, Streamlit regularly adds features, but ultimately it's just a kit for arranging pre-made parts.

I felt we needed to use general-purpose web app libraries, not dashboard libraries, so now I'm playing with Reflex.

I didn't know these libraries so I read their documentation.

  • Reflex

https://github.com/reflex-dev/reflex?tab=readme-ov-file

https://reflex.dev/

Seems you can generate quite rich UIs. You can create frontends React-style. While Streamlit emphasizes creating interactive apps, Reflex supports routing so it feels like a lower-layer web development kit.

  • Flet

https://github.com/flet-dev/flet

This is Flet. It seems to be a library for building Flutter-based apps from Python.

Honestly, my first thought was: in this day and age, can't you just make Flutter apps with AI?

##GUI Scheduled Backup Software Using OS Standard Powerful File Copy Command "robocopy"

Software that can set up GUI scheduled file backups on Windows. Noted for future use.

##Cas and Other Nucleases in Development

If I could get merchandise like this, maybe I'll take the test. Looking at past questions, it seems easy.

But if it's this easy, it's not worth putting on a resume...

##Announcing Fully Managed RStudio on Amazon SageMaker for Data Scientists

Fully managed RStudio has appeared on AWS SageMaker. Good thing.

I only use R for data wrangling, so unfortunately my local machine is sufficient.

##OpenAI Assistants API Quickstart

https://github.com/openai/openai-assistants-quickstart?tab=readme-ov-file

OpenAI released a simple AI assistant app as open source. The design is a bit cool, and it seems like a good entry point to start AI app development.

##Cell2Sentence

Models fine-tuned from GPT-2 and Pythia-160m. The authors selected 100 genes from single-cell data and trained on their expression levels and phenotypes. As a result, the model can generate expression profiles from a cell type (e.g., CD4 T cells in melanoma) and handle multiple tasks.

But looking at the evaluation published on HuggingFace, it doesn't seem to have reached practical levels yet...

https://huggingface.co/vandijklab/pythia-160m-c2s

https://github.com/vandijklab/cell2sentence-ft/tree/main
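
As I understand the "cell sentence" idea, each cell's genes are ranked by expression level and the top-ranked gene names are written out as a plain string that a language model can be trained on. The sketch below illustrates that reading only; the gene counts are made up, the function name is hypothetical, and this is not the project's actual API or preprocessing pipeline.

```python
def cell_to_sentence(expression, top_k=100):
    """Rank genes by descending expression and join the top_k gene names.

    `expression` maps gene name -> expression value for a single cell.
    Genes with zero expression are dropped; ties are broken alphabetically.
    """
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    ranked = sorted(expressed, key=lambda item: (-item[1], item[0]))
    return " ".join(gene for gene, _ in ranked[:top_k])

# Hypothetical expression counts for one cell.
cell = {"CD3E": 42, "CD4": 30, "IL7R": 30, "GAPDH": 120, "CD8A": 0}

print(cell_to_sentence(cell, top_k=3))
```

Once cells are text like this, an ordinary text model such as GPT-2 can be fine-tuned on them without any architecture changes.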

As an AI beginner, I had to start by learning what CLIP is, but I got the overview from a past Matsuo Lab student's materials.

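CLIP is trained on paired image and text data so that images can be classified using text. As I understand it, something similar could be done with regular deep learning, but the notable points are that it learns by contrasting matching and non-matching image-text pairs and that the model uses transformers.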

For biology, cell images would be interesting: for example, training on genes and their corresponding cell images so that the model outputs the names of abnormal genes from cell photos.
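
To show how "classifying images with text" can work once images and captions live in the same embedding space, here is a minimal sketch: compute the cosine similarity between one image embedding and each caption embedding, then turn the similarities into probabilities. The embeddings are made-up three-dimensional vectors and the temperature is just an illustrative scaling factor; a real CLIP model produces the embeddings with trained image and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_embedding, text_embeddings, temperature=0.07):
    """Return a probability for each caption given one image embedding."""
    labels = list(text_embeddings)
    scores = [cosine(image_embedding, text_embeddings[label]) / temperature
              for label in labels]
    return dict(zip(labels, softmax(scores)))

# Hypothetical embeddings; real ones come from the trained encoders.
image = [0.9, 0.1, 0.2]
captions = {
    "a photo of a healthy cell": [0.8, 0.2, 0.1],
    "a photo of an abnormal cell": [0.1, 0.9, 0.3],
}

probs = zero_shot_classify(image, captions)
best = max(probs, key=probs.get)
print(best)
```

Swapping the captions for gene names would give the "gene name from a cell photo" setup, assuming suitable paired training data exists.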

##BrainPad New Employee Training Materials

Excellent basic statistics materials. The essentials are well summarized, and you get the impression they were written by someone who really understands statistics.

Good for digging into each topic as needed, or for those who've already studied to quickly review.

##Interview with Former Chinese Public Security Defector

I think it's common knowledge that China does things like this (using operatives to suppress individuals living abroad who engage in anti-government activities). But when a circumstantial "they must be doing it" becomes "they really are doing it" through insider testimony, my gut-level understanding goes up a notch.

Either way, high risks should be addressed, so there should be no change in US-Japan government response and risk management.

##Bioinformatics Algorithms

https://womengovtcollegevisakha.ac.in/departments/Bioinformatics%20Algo'ms-%20An%20active%20learning%20approach.pdf

https://compeau.cbd.cmu.edu/online-education/bioinformatics-algorithms-an-active-learning-approach/

##Mutant CD47 Evades Macrophage Phagocytosis Even in Presence of CD47 Antibody (Nature, 2024, published last week)

https://www.nature.com/articles/s41586-024-07443-8

Research from Stanford. It starts from the hypothesis that combining anti-CD47 antibodies, which are being actively developed by Gilead, AstraZeneca, and others, with CAR-T would give a strong anti-tumor effect. CD47 antibodies aim for an anti-tumor effect by blocking the cancer cells' don't-eat-me signal, but they also block the don't-eat-me signal of the CAR-T cells, which makes the combination difficult. So the authors created CD47 mutants carrying point mutations and succeeded in avoiding binding by the CD47 antibody.

I was going to present the above at journal club, but I scrapped it.

I also looked into the development status of CD47 antibodies, but it wasn't clear whether things are going smoothly. A 2022 article showed Pfizer as enthusiastic about development, but recently it seems to be toning things down.

That said, Pfizer didn't repeat its 2022 claim that CD47 represents a peak sales opportunity of $3 billion or more, and yesterday's presentation made no mention of ontrulpacept, a second SIRPα fusion protein derived from Trillium.

Gilead's magrolimab is also still on clinical hold, so the outlook for CD47 antibodies seems somewhat cloudy.

##Let's Use pathlib More

I'm quite a pathlib fan, but I don't see it much in code online, which makes me a little lonely.

This article introduces a wide range of tricks, including some I didn't know. A good article. Here's a usage I personally like that wasn't in the article. There's no need for with open, so it's very easy.

    from pathlib import Path

    # read_text opens the file, reads all of it, and closes it in one call
    p = Path("hoge/hogehoge.txt")
    text_data = p.read_text(encoding="utf-8")
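
For comparison, here is a small sketch showing that read_text gives the same result as the usual with open pattern. It writes to a temporary directory so it can run anywhere; the file name and contents are just placeholders.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    p = Path(tmp) / "hogehoge.txt"

    # write_text opens, writes, and closes the file in one call.
    p.write_text("hello pathlib\n", encoding="utf-8")

    # The traditional pattern with an explicit context manager.
    with open(p, encoding="utf-8") as f:
        via_open = f.read()

    # The pathlib one-liner reads the same content.
    via_pathlib = p.read_text(encoding="utf-8")

print(via_open == via_pathlib)
```

Passing encoding explicitly is a habit worth keeping, since the default depends on the platform.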

##Resources When You Want to Build an LLM from Scratch

I want to try building an LLM from scratch. I'm studying.

##AlphaFold3

AlphaFold3 with improved accuracy was announced, but not releasing source code is apparently very controversial. Looking at supplemental data, the pseudocode apparently only contains docstrings. However, there seem to be people trying to reproduce it as open source. Interesting.

But after the Nature paper came out, all the rebuttals and everything else are being published in Nature as well. There is so much excitement that it almost looks like self-promotion...

##Roasting Tokyo Metropolitan Government's Generative AI Use Case Collection

I'd been hearing rumors that recent ChatGPT already has instructions built in without prompting, but this note explains it very clearly.

##Samba-1 Turbo

Another inference-specialized platform has appeared. You can try it on the Samba-1 website. I thought it might be faster than Groq, but with the prompt I tried, Groq reached 1270.05 tokens/sec and Samba-1 reached 1026.15 tokens/sec (Llama3-8B).

##Polars Japanese Community

polars-jp has been launched. Polars, a very fast alternative to pandas, has been around for about 5 years I think, and I tried it back at my previous job.

But at the time it wasn't clear whether polars would replace R's dplyr in the future, and I was looking for the same interface as R, so I only adopted it for 2-3 projects and haven't used it since.

However! The recently published "Data Preprocessing Encyclopedia Second Edition" has apparently been completely revised so that the explanations the first edition gave with dplyr now use polars!

This is a tremendous trend. Is the era when polars replaces pandas finally coming?