I am slowly building a sort of atom/universe generator/visualizer in Rust with friends. The project is called Oxidizy.
I started this project many years ago, but now that Bevy is available as a game engine that makes ECS a breeze, I decided to go back and optimize the universe generator.
The first thing was to tackle multithreaded mutations; once that was at a reasonable state, we moved on to adding more layers to the generator.
Now we are at the point where memory is starting to become an unfortunate constraint, even on a machine with 32GB of RAM (DDR4 3200MHz C16).
The program (unigen, not the simulator) at max workload with my hardware outputs the following:
$ ./scripts/generate.sh 360
--------------------------------
Threads: 16
Building..
--------------------------------
Universe built!
--------------------------------
Field is Anionic
--------------------------------
Atoms: 46656000
Baryons: 11010816000
Quarks: 33032448000
--------------------------------
real 0m5.797s
user 0m0.000s
sys 0m0.031s
That’s a lot of Quarks! 33 billion..
Before the quark optimization (in this PR) we capped out at 5.6 billion.
We reduced the memory footprint 6.6 times, and this runs at around the same speed as the prior 5.6-billion-Quark runs. That is an added bonus, since it increases the load on all threads now that we are doing more processing on the CPU (all logical cores). The CPU bump is unique to our application rather than a consequence of using enums: we chose to keep the original structures and infer the enum type from them. More tedious, but it will help when we introduce algebra.
Decreasing the memory profile by utilizing enums and some additional processing improved performance 4 times over. That’s a rare outcome for sure.
Here is a basic Enum in Rust:
#[derive(Debug, Copy, Clone)]
pub enum Apple {
Green,
Red,
Yellow,
}
Say we grab a bunch of random `Apple`s out of a basket. You can inspect an `Apple` and see that it’s either `Apple::Green`, `Apple::Red`, or `Apple::Yellow`.
That’s a pretty powerful construct. No need to store strings, or ints, or booleans, or anything really.
You can now just store imaginary words that your editor can infer and that you can also read sensibly.
That `Apple` enum is 1 byte. You can add, say, 20 other imaginary things to the `Apple` and it will still be 1 byte.
Something like so:
#[derive(Debug, Copy, Clone)]
pub enum Apple {
GreenAndFresh,
GreenAndNotFresh,
RedAndFresh,
RedAndNotFresh,
YellowAndFresh,
YellowAndNotFresh,
}
Now you can inspect a single `Apple` enum and have it be possibly 3 different colors as well as 2 different states of freshness, but it will always be one of the 6 variants.
This is really fun for matching, especially with tuples!
Important side note:
If you go beyond the C-style enum and start storing more complex variants, your enum size will vary.
Here is a great rundown on Stack Overflow: Enum Size Rundown
A very useful function when optimizing: `std::mem::size_of`
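For example, here is a quick sketch (not from the Oxidizy codebase) that uses `std::mem::size_of` to confirm both the 1-byte claim and the side note about data-carrying variants; `WeighedApple` is a made-up type for illustration:

```rust
use std::mem::size_of;

// C-style enum: every variant is a bare name, so 1 byte is enough.
#[derive(Debug, Copy, Clone)]
pub enum Apple {
    GreenAndFresh,
    GreenAndNotFresh,
    RedAndFresh,
    RedAndNotFresh,
    YellowAndFresh,
    YellowAndNotFresh,
}

// Once a variant carries data, the enum must hold the payload plus a tag.
#[derive(Debug, Copy, Clone)]
pub enum WeighedApple {
    Unweighed,
    Grams(u32),
}

fn main() {
    println!("Apple: {} byte(s)", size_of::<Apple>()); // 1
    // 4-byte payload + tag, padded for alignment.
    println!("WeighedApple: {} byte(s)", size_of::<WeighedApple>()); // 8
}
```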
Let’s do something similar with a `Carrot`:
#[derive(Debug, Copy, Clone)]
pub enum Carrot {
OrangeAndFresh,
OrangeAndNotFresh,
PurpleAndFresh,
PurpleAndNotFresh,
YellowAndFresh,
YellowAndNotFresh,
}
Now you can have a basket of `Apple`s and `Carrot`s in different states.
Say you are executing a function called `inspect_an_apple_and_a_carrot`:
let my_food_basket = (Apple::RedAndFresh, Carrot::PurpleAndNotFresh);
match my_food_basket {
(Apple::RedAndFresh, Carrot::PurpleAndNotFresh) => println!("find a fresh purple carrot"),
(Apple::RedAndFresh, Carrot::PurpleAndFresh) => println!("go pay at checkout"),
_ => println!("not sure what to do"),
}
Cool, let’s go over why that saved us a ton of space.
With a more traditional yet maintainable approach, you would do something like:
#[derive(Debug, Copy, Clone)]
pub struct Apple {
pub color: Color,
pub freshness: Freshness,
}
Where `Color`/`Freshness` are enums similar to `Color::Red`/`Freshness::Fresh`.
A quick and easy struct, while offering less inference for your editor, would be:
#[derive(Debug, Copy, Clone)]
pub struct Apple {
pub color: String,
pub freshness: String,
}
An optimized version of that:
#[derive(Debug, Copy, Clone)]
pub struct Apple {
color: u8,
freshness: u8,
}
Here a `u8` is a cheap memory-saving trick, but you still have to map things out yourself and you don’t get as much intellisense. While the editor will know it’s a `u8`, you’ll have to memorize what 0, 6, 11, or 24 means.
Whereas with an enum, you just know, because it tells you.
The enum also has half the footprint of the `u8` version: since we have to store 2 `u8`s in the low-memory struct, that’s two bytes, while the enum stores all 6 potential states in 1 byte.
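A quick sketch to verify that claim with `std::mem::size_of` (the type names here are made up for the comparison):

```rust
use std::mem::size_of;

// The two-u8 struct version: one byte per field.
#[derive(Debug, Copy, Clone)]
pub struct AppleU8 {
    pub color: u8,
    pub freshness: u8,
}

// The enum version: all six color/freshness combinations in one byte.
#[derive(Debug, Copy, Clone)]
pub enum AppleEnum {
    GreenAndFresh,
    GreenAndNotFresh,
    RedAndFresh,
    RedAndNotFresh,
    YellowAndFresh,
    YellowAndNotFresh,
}

fn main() {
    println!("struct of u8s: {} bytes", size_of::<AppleU8>()); // 2
    println!("enum: {} byte", size_of::<AppleEnum>());         // 1
}
```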
Pretty cool!
This PR in Oxidizy introduces a work in progress of this refactor.
Essentially, additional enums were made to create a representation of a created `Proton`/`Neutron`. We still create the original elements on the fly so that all the correct business logic is in place, then we infer from the created object the representation of that data that we will store in RAM. The created object, which is not stored, now disappears, reducing the memory footprint. `Protons` is a two-field struct with a count and a default array of 118 `ProtonData::Unknown`s.
What is `ProtonData`? That is the made-up abstraction over the `Proton` objects themselves:
#[derive(Debug, Copy, Clone)]
pub enum ProtonData {
Unknown,
RedUpUpDownQuark,
BlueUpUpDownQuark,
GreenUpUpDownQuark,
AlphaUpUpDownQuark,
}
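Based on that description, the `Protons` container might be sketched like this (the field names are my guess, not necessarily the exact ones from the PR):

```rust
#[derive(Debug, Copy, Clone)]
pub enum ProtonData {
    Unknown,
    RedUpUpDownQuark,
    BlueUpUpDownQuark,
    GreenUpUpDownQuark,
    AlphaUpUpDownQuark,
}

// Two fields: how many protons are real, plus a fixed array sized for
// the 118 known elements, defaulted to ProtonData::Unknown.
pub struct Protons {
    pub count: u32,
    pub protons: [ProtonData; 118],
}

impl Protons {
    pub fn new(count: u32) -> Self {
        Protons {
            count,
            protons: [ProtonData::Unknown; 118],
        }
    }
}

fn main() {
    let protons = Protons::new(1);
    // 118 one-byte enums plus the count: a tiny, cache-friendly footprint.
    println!("{} slots, count = {}", protons.protons.len(), protons.count);
}
```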
Something similar was done with quarks, and an enum called `QuarkData` was made. This makes processing a `Proton` quite simple: match on a 3-element tuple:
impl ProtonData {
pub fn new(proton: Proton) -> Self {
let first_quark: QuarkData = Quark::data(proton.quarks.0);
let second_quark: QuarkData = Quark::data(proton.quarks.1);
let third_quark: QuarkData = Quark::data(proton.quarks.2);
match (first_quark, second_quark, third_quark) {
(QuarkData::RedUpQuark, QuarkData::RedUpQuark, QuarkData::RedDownQuark) =>
ProtonData::RedUpUpDownQuark,
(QuarkData::BlueUpQuark, QuarkData::BlueUpQuark, QuarkData::BlueDownQuark) =>
ProtonData::BlueUpUpDownQuark,
(QuarkData::GreenUpQuark, QuarkData::GreenUpQuark, QuarkData::GreenDownQuark) =>
ProtonData::GreenUpUpDownQuark,
(QuarkData::AlphaUpQuark, QuarkData::AlphaUpQuark, QuarkData::AlphaDownQuark) =>
ProtonData::AlphaUpUpDownQuark,
_ => ProtonData::Unknown,
}
}
}
This same logic is being implemented for `Neutron`s as well, since they are also made of `Quark`s.
Here is `QuarkData` for further clarification:
#[derive(Debug, Copy, Clone)]
pub enum QuarkData {
Unknown,
RedUpQuark,
RedDownQuark,
BlueUpQuark,
BlueDownQuark,
GreenUpQuark,
GreenDownQuark,
AlphaUpQuark,
AlphaDownQuark,
}
So there you have it.
Spend a bit more CPU, decrease memory allocations by a significant amount by utilizing C-style enums, and a faster program emerges!
So I waited a bit for Catalina to mature before upgrading my Mac. It will keep you on `bash` if you upgrade, but it will only have `zsh` if you do a clean install.
I tested out `zsh` on my Linux machine (Ubuntu 18.04 LTS) and discovered some neat things. Once I felt comfortable, I changed the default shell on Catalina:
chsh -s /bin/zsh
First of all, there is no more real need to worry about a `.bash_profile`/`.zprofile`, as the `.zshrc` behaves like a `.bashrc`.
Let’s say I like to have a terminal prompt like so:
dom.events (master) $
Where the cursor is one space after the dollar sign.
Let’s take a popular function (`parse_git_branch`) and use the easy-to-read color schemas for zsh:
setopt PROMPT_SUBST
parse_git_branch() {
echo $(git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/ (\1)/')
}
prmpt() {
current_dir=$(basename $(pwd))
current_branch=$(parse_git_branch)
prompt="%F{yellow}${current_dir}%f"
if [[ $current_branch != '' ]]
then
prompt="${prompt} %F{green}${current_branch}%f"
fi
prompt="${prompt} $ "
PROMPT=$prompt
}
precmd() {
prmpt
}
Here we see a function called `precmd`. According to the zsh documentation on SourceForge (the link is plain-text http, not https):
precmd
Executed before each prompt. Note that precommand functions are not re-executed simply because the command line is redrawn, as happens, for example, when a notification about an exiting job is displayed.
So this is exactly what we want! After we change a directory, switch branches, execute a command, we get a consistent update that is in tune with our environment!
I also like the `%F{yellow}...%f` notation. The capital `%F` is the beginning of a color block and the lowercase `%f` is the end, so you can easily dictate where the color changes.
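For instance, a made-up minimal prompt (not the one above) that colors just the username:

```shell
# %F{blue} starts a blue block, %f ends it; %n is the current username.
PROMPT='%F{blue}%n%f $ '
```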
Hope this post made switching over a bit easier. I use two really helpful aliases when updating my rc files; these two are specific to the `.zshrc` file.
alias zrc="code $HOME/.zshrc"
alias zgo="source $HOME/.zshrc"
This way I can open, edit, and make available any change I am working on in my rc file.
In a previous post I wrote about writing a concurrent TCP server in Go.
However, one unfortunate thing is that `nc` (netcat) is not native to Windows, and might also not be available on lightweight containers.
So in this post we will talk about writing a TCP client that can be compiled to windows/darwin/linux/etc..
All code will be explained using comments in the code block.
package main
import (
"bufio"
"fmt"
"log"
"net"
"os"
)
func main() {
// ip and port of TCP server
ip := "127.0.0.1"
port := "8081"
// format ip and port
addr := ip + ":" + port
// dial into the tcp server and have access to the connection via `conn`
conn, err := net.Dial("tcp", addr)
if err != nil {
log.Fatal(err)
}
// block main() and set up a reader for stdin to read inputs from the shell
for {
reader := bufio.NewReader(os.Stdin)
// grab all text prior to hitting enter
text, err := reader.ReadString('\n')
if err != nil {
log.Fatal(err)
}
// send the payload over the connection
// (text already ends with the '\n' from ReadString, so no need to append another)
fmt.Fprint(conn, text)
// read the response from the server
// if you modified your server to not send a response you can omit everything below
message, err := bufio.NewReader(conn).ReadString('\n')
if err != nil {
log.Fatal(err)
}
fmt.Print(message)
}
}
Great! Now you have a simple client that does mostly what we use netcat for anyways.
If your TCP server is being interacted with from a Windows client, you will have to add some logic to check for `\r\n` instead of just `\n`.
Setting ENVs on Windows can also be a pain, so here we will add usage of the `flag` parser built into the Go std lib.
I have only added comments where we add new functionality:
package main
import (
"bufio"
"flag"
"fmt"
"log"
"net"
"os"
)
func main() {
var ip string
// make a CLI flag for the IP address
// go run main.go -ip=10.0.0.42
// default is "127.0.0.1"
flag.StringVar(&ip, "ip", "127.0.0.1", "ip addr of TCP server")
var port string
// make a CLI flag for the port
// go run main.go -port=9000
// default is "8081"
flag.StringVar(&port, "port", "8081", "port of TCP server")
// full custom use of both ip and port: go run main.go -ip=10.0.0.42 -port=9000
flag.Parse()
addr := ip + ":" + port
conn, err := net.Dial("tcp", addr)
if err != nil {
log.Fatal(err)
}
for {
reader := bufio.NewReader(os.Stdin)
text, err := reader.ReadString('\n')
if err != nil {
log.Fatal(err)
}
fmt.Fprint(conn, text)
message, err := bufio.NewReader(conn).ReadString('\n')
if err != nil {
log.Fatal(err)
}
fmt.Print(message)
}
}
So now you can have a nice CLI interface. You can also ask for help by doing:
go run main.go -h
-ip string
ip addr of TCP server (default "127.0.0.1")
-port string
port of TCP server (default "8081")
So without having to think about how to make a nice output block, the `flag` lib has got your back.
Hope you learned how to use `flag` as well as how to make a cross-platform TCP client that will work anywhere you have Go, or, if you cross-compile your binary, anywhere you please!
As I have been writing more Go lately, I have found myself using Go as a scripting language more and more.
Replacing old bash or ruby scripts in Go has given me some really nice advantages.
Multi-platform binaries.
You can compile Go bins for any platform of choice, wherever you would otherwise develop software or automate some tasks.
Go is very easy to install on Linux/macOS/Windows and I, for ADHD and technical reasons, use all three platforms.
Why Go?
Between all OSs, having to pick between WSL/Git Bash/MinGW/Bash/Zsh/etc.. can be confusing.
This isn’t just for me either. Others might be running shells that don’t support `&&`, or all they have is `powershell`.
I typically like to use native shells, so I needed something akin to Docker for scripting.
All of a sudden you can write in an easily installable language that has excellent support in VSCode.
The built-in `flag` lib makes for some neat self-documenting CLIs.
Is it a bit more verbose? Sure. Does it run faster? Sure. Is it more convenient when swapping environments? Absolutely!
Something about the zen of Go, and the plethora of built-in std lib features like http/os/flag/sync/goroutines/etc., makes it really easy to convert common bash scripts into a statically typed script that can be compiled and shared.
Even if you are not compiling the scripts, `go run cmd/script/main.go` is convenient and still really fast.
Sometimes bash is the clear winner. You want to automate a task or have a special build for an Elixir/Rails/Spring project in Docker. The image you are going to use more than likely won’t have Go in it but bash will be there. Don’t add friction!
Or you are writing a Jenkins/Travis/GitLab CI job and a few `curl`/`grep`/`sed` commands will do just fine.
Sometimes ruby can be simpler but you can typically always replace a ruby script with Go unless you are using some quality of life gems that would make writing something in Go a nightmare.
Instead of just writing apis, you can finally have some fun and learn other areas of the language. It really has fantastic documentation and there is so much offered without needing external packages.
I won’t replace all of my scripts in Go, but this is a great way to sharpen the Go knife and make life simpler!
Today we will learn how to write a TCP server in Go that is concurrent in nature. This will enable this server to handle more than one connection. This server will also know if a client disconnects without asking the server to close the connection.
package main
import (
"log"
"net"
)
func main() {
// boot up tcp server
listener, err := net.Listen("tcp", "127.0.0.1:8080")
if err != nil {
log.Fatal("tcp server listener error:", err)
}
// block main and listen to all incoming connections
for {
// accept new connection
conn, err := listener.Accept()
if err != nil {
log.Fatal("tcp server accept error", err)
}
// spawn off goroutine to able to accept new connections
go handleConnection(conn)
}
}
Ok, so for now everything is quite simple. Your server listens on localhost:8080. Then you wrap all new connections in a `for {}` to keep your server alive forever.
Now you accept the client connection. An easy way to connect would be: `nc localhost 8080`
But what does `handleConnection` do? And why use a goroutine?
Great questions!
If we don’t use a goroutine, we will not be able to accept another client; the function would hang in our `for {}` block until the client leaves!
So here we just preface `handleConnection` with `go`, and now we are able to handle multiple clients.
Now let us define this handle connection. Comments in the code will explain the functionality!
func handleConnection(conn net.Conn) {
// read buffer from client after enter is hit
bufferBytes, err := bufio.NewReader(conn).ReadBytes('\n')
if err != nil {
log.Println("client left..")
conn.Close()
// escape recursion
return
}
// convert bytes from buffer to string
message := string(bufferBytes)
// get the remote address of the client
clientAddr := conn.RemoteAddr().String()
// format a response
response := message + " from " + clientAddr + "\n"
// have server print out important information
log.Println(response)
// let the client know what happened
conn.Write([]byte("you sent: " + response))
// recursive func to handle io.EOF for random disconnects
handleConnection(conn)
}
package main
import (
"bufio"
"fmt"
"log"
"net"
)
func main() {
listener, err := net.Listen("tcp", "127.0.0.1:8080")
if err != nil {
log.Fatal("tcp server listener error:", err)
}
for {
conn, err := listener.Accept()
if err != nil {
log.Fatal("tcp server accept error", err)
}
go handleConnection(conn)
}
}
func handleConnection(conn net.Conn) {
bufferBytes, err := bufio.NewReader(conn).ReadBytes('\n')
if err != nil {
log.Println("client left..")
conn.Close()
return
}
message := string(bufferBytes)
clientAddr := conn.RemoteAddr().String()
response := message + " from " + clientAddr + "\n"
log.Println(response)
conn.Write([]byte("you sent: " + response))
handleConnection(conn)
}
Say you want to delete all branches that start with `feature-`.
You can use `IO.read` to check that the branch names make it through the pipe:
git branch | grep feature | elixir -e 'IO.read(:all) |> IO.puts'
Ok, so now that you know this is going to work, we can write the actual script:
git checkout -b feature-new-branch \
&& git checkout master \
&& git branch \
| grep -v '* master' \
| grep 'feature-' \
| tr -d '\n' \
| elixir -e 'IO.read(:all) |> String.trim("\n")
|> (fn args -> "git branch -D #{args}" end).()
|> to_charlist |> :os.cmd |> IO.puts'
So now you can start piping your heart out!
You might notice:
IO.read(:all)
|> String.trim("\n")
|> (fn args -> "git branch -D #{args}" end).()
There is an anonymous function that takes in the pipe output and executes itself (think IIFE in JS).
This is the easiest workaround to not making a variable! :pray:
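Here is that pattern on its own, outside the git pipeline (a toy string instead of real branch names):

```elixir
# Pipe a value straight into an anonymous function and call it in place.
"  feature-x  "
|> String.trim()
|> (fn args -> "git branch -D #{args}" end).()
|> IO.puts()
# prints: git branch -D feature-x
```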
Here is the output:
Switched to a new branch 'feature-new-branch'
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
Deleted branch feature-new-branch (was a4144dd).
I really do enjoy using ruby to do this as well, but you end up using a ton of semicolons (`;`) and defining really small, cryptic variables.
Otherwise your “one liner” becomes quite long.
I am sure `awk` and `sed` can continue to be used instead, but sometimes using elixir is more fun! :tada:
To be fair, using `tr` and `xargs` this problem is solved in a much simpler fashion:
git checkout -b feature-new-branch \
&& git checkout master \
&& git branch \
| grep -v '* master' \
| grep 'feature-' \
| tr -d '\n' \
| xargs git branch -D
:joy:
Ok, so for a while now I have been using `Task.async/1` and `Task.await/1` with piped `Enum.map`(s) to get some concurrent work done.
Imagine `api_call/1` exists and makes an HTTP request somewhere:
0..20
|> Enum.map(fn idx ->
Task.async(fn -> api_call(idx) end)
end)
|> Enum.map(fn task -> Task.await(task) end)
|> IO.inspect
However, TIL about `Task.async_stream/3`!
You can go read about it here: Task.async_stream/3
Runs through your enum in chunks (equal to the number of logical cores on your machine) to execute the tasks as fast as possible! :rocket:
By default, the maximum number of tasks to run at the same time equals the result of `System.schedulers_online/0`.
Here it is being used in comparison to the earlier snippet:
0..20
|> Task.async_stream(fn idx -> api_call(idx) end)
|> Enum.map(fn {:ok, result} -> result end)
|> IO.inspect
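If you want to cap the concurrency yourself, there is a `:max_concurrency` option (here with a stand-in function instead of `api_call/1`):

```elixir
# Run at most 4 tasks at a time, regardless of core count.
0..20
|> Task.async_stream(fn idx -> idx * idx end, max_concurrency: 4)
|> Enum.map(fn {:ok, result} -> result end)
|> IO.inspect()
```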
Much better! :tada:
Reminds me of `par_iter` in the Rayon Rust crate: par_iter
Not only is it a win in performance, but also a win in code clarity and cleanliness :smile:
These days a lot of hardware has moved from RS-232 over to USB.
So one day my dad told me he wanted to have a way to click a file or a button and turn one of his ICOM radios into USB mode or to FM mode.
I made a little template file for him in Powershell, where he could just change two variables and would be ready to go!
Here is the repo: icom-cmd
Yea, sooo coming from web development this was weird. Apparently you have to send over a byte array that has hex codes in it, and there is an actual API spec to follow.
With the ICOM protocol, every command starts with `FE FE` and ends with `FD`.
Ok, that’s cool. There is a defined protocol!
My dad knew both commands, and explained how to read the API (docs for the API in the repo).
Mock! Mock the whole environment. So I googled around, and at least on Windows there are two neat tools for this.
My dad uses Win10 so I had to be sure to write this in something that comes standard with his machine.
RealTerm can listen to the ports (kinda like a Wireshark for USB) and com0com makes a virtualized port, so you can have an input and output stream instead of a single connection to the port with no way of seeing what you sent.
It took a while but you end up learning about how to open a connection to the open port, and send the payload over:
$serialPort = "COM3"
$cmdString = "FE FE 94 E0 26 00 05 00 01 FD"
$bins = New-Object System.Collections.Generic.List[System.Object]
$hexes = $cmdString.Split(' ')
foreach ($hex in $hexes) {
[Byte] $converted = [int]"0x$hex"
$bins.Add($converted)
}
[Byte[]] $binaries = $bins
$port = New-Object System.IO.Ports.SerialPort $serialPort, 9600, None, 8, One
$port.open()
Start-Sleep -m 100
$port.Write($binaries, 0, $binaries.Count)
Start-Sleep -m 100
$port.Close()
Once I was able to confirm the payload I was sending was correct, I would email the script to my dad and he would try it out.
It worked! He was now able to double click the file on his desktop and get it to work!
Eventually I ended up building an executable in go/js that he could run that has an actual frontend and can save commands (localStorage / a file on his computer).
Golang was actually great for this. Pretty much everything in go is a byte array of some kind :joy:, so it was a great way to get comfortable with the language!
Here is the repo for that project: hmrcmd
TIL about `$?` :thinking:
Just as confusing as it looks, but it makes a good bit of sense once you learn it.
Say you run: `echo 'hello' | grep 'ello'`
That will have an exit code of `0` (no failure; it ran successfully).
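You can see the value yourself by echoing `$?` immediately after a command runs:

```shell
echo 'hello' | grep 'ello' > /dev/null
echo $?   # 0: grep found a match
echo 'hello' | grep 'xyz' > /dev/null
echo $?   # 1: grep found nothing
```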
Say you have a file `answer_to_life.txt`. We can grep the contents of the file and use `$?` to see if grep succeeded in finding our matching pattern.
So you can do something like:
cat answer_to_life.txt | grep '42'
if [ $? -eq '0' ]
then
echo 'The answer to life is: 42'
else
echo 'Apparently the answer to life is not 42'
fi
We have the grep run, check the exit code of the latest operation, and if it equals `0`, then we print our known fact. Otherwise we print that we guessed wrong.
Just a neat little trick. Should be useful for a lot of things! :tada:
TIL about a command line util called `entr` :tada:
Website: entrproject
mac: brew install entr
ubuntu: sudo apt install entr
This will watch any filename you pipe into entr. Then it can run any script you pass it with `-s`, or just completely reload a process (ctrl-c and start again) using the `-r` flag.
Perfect for scripting or spiking parsing weird payloads :rocket:
Example scripts:
ls *.rb | entr -r ruby main.rb
echo 'script.js' | entr -r node script.js
echo 'main.go' | entr -sr \
'docker stop $(docker ps -aq) && docker-compose up --build'
So useful!
I made a repo so I could script in a bunch of different languages with watch scripts for each one!
Check it out: dev.random