Author: Robert

  • The HP T740

    Just like most homelabbers, I like mini PCs. They are cute, use very little power and are quiet, at least at idle and under light loads. The Intel NUCs were at the forefront of this revolution, and it’s quite unfortunate that Intel has discontinued that division. The newer generations of mini PCs seem to have somewhat more flexibility thanks to extra M.2 slots and proprietary expansion modules, but that’s usually it. No PCIe slots that would make them truly universal.

    This is where HP’s T740 thin client differs.

    The T740 is a thin client, at least on paper. It is the top-end model of their TX40 series and is aimed at a very specific segment: graphics-intensive computing environments. It has four DisplayPort outputs for driving four 4K screens, which can optionally be increased to six if one opts for the preinstalled AMD Radeon E9173 GPU that adds two more. And this has already given away the secret, which is that it has a PCIe expansion slot. Yes. In a thin client.

    The default configuration with the memory shield removed

    That alone wouldn’t make it truly one of a kind, but in addition, it also boasts a quad-core AMD Zen-based CPU with up to 64GB of RAM. If HP were to put the same specs in one of their ProDesk desktops, it would fit right in with the rest of them. This is practically desktop level power.

    • AMD Ryzen V1756B CPU, 3.25GHz base, up to 3.6GHz boost, with a Vega 8 iGPU
    • Two DDR4 SODIMM slots for up to 64GB of RAM
    • Two 2280 M-key M.2 slots, one for NVMe and one for SATA drives
    • One 2230 E-key M.2 slot for a wireless card

    The Ryzen V1756B is the embedded, soldered equivalent of the Ryzen 5 2400G, codenamed Great Horned Owl. It is a 14nm chip from Q1 2018, so while it isn’t exactly a recent chip, it still has decent performance.

    The AMD Raven Ridge die, from FritzchensFritz on Flickr

    The design looks very similar to HP’s other contemporary thin clients, and its BIOS copies their other thin clients as well, which shouldn’t be a surprise. However, we still get support for virtualization and some more advanced setup options.

    The entire desktop Thin Client lineup from HP with the T740 on the right

    I’ve had the opportunity to pick up three of them for a reasonable price, and so I did. They were configured with 1x4GB of RAM and 16GB of eMMC flash, which is a bit like the Steam Deck’s 64GB drive: it’s an NVMe device, but it doesn’t come anywhere close to saturating those speeds and is actually slower than a SATA drive would be; it just uses PCIe instead of SATA.

    I ended up doing very different things with the three units. One became a pfSense box, another a mini gaming PC, and the last one is just an HTPC, which doesn’t really utilize the PCIe slot. I actually had a T640 for that up until recently, so it essentially replaces that.

    ServeTheHome has a great article on the T740, but I did find some conflicting info. They measured an idle power consumption in the 19-20W range, while I measured around 10-12W when booted into the factory ThinPro OS with BIOS v01.15 reset to defaults. This seems to be too big a difference to be caused by variations in the testing equipment, so I think the power can be brought down to around 10W, in line with most other thin clients. The fan seems to be always-on by default; I haven’t checked whether it can be turned off at idle, but it’s completely inaudible, so it doesn’t really bother me. In fact, if there is a PCIe card installed, the extra airflow can be quite useful to ensure things don’t heat up too much inside.

    The router

    I’ve been using pfSense for a long time, but I’m certain that any other router OS would run great on this thing. I’ve added a Mellanox ConnectX-3 Pro NIC with two 10Gb SFP+ ports. I’m not entirely certain if the CPU is capable of routing 10Gb/s of traffic, as I haven’t had a chance to test it with such a high speed connection. If pfSense can’t do it, then TNSR should be able to. FreeBSD-based systems may need some tweaks to boot up, but that’s just a one-off setup.

    The router with an especially low profile ConnectX-3 Pro

    I used a 128GB SATA drive with MLC flash to ensure it lasts and added 2x4GB of RAM, alongside the ConnectX-3. The integrated gigabit ethernet controller is a Realtek 8111, which is not everyone’s favourite, but I’ve used a mini PC with two of them as a pfSense router with no issues whatsoever. I don’t know if 2.7.0 includes the drivers for them or you still need to install them separately, but they do work, so that extra port can be used for either local LAN access or for a second WAN connection. This config idles around 20W with no added optics or other SFP modules in the Mellanox NIC.

    The HTPC

    This is the simplest one, I just pulled the eMMC and replaced it with a 256GB NVMe drive. The integrated Vega 8 GPU is fast enough to handle some of the more taxing upscaling and sharpening filters in madVR and it can output a 24Hz signal as well for 24p Blu-ray playback. All of this in 4K, of course. Passive DP to HDMI adapters work well, as all four are DP++ ports, but those can only do 1080p. If you want to connect a 4K display, you need to use a DP cable or an active DP to HDMI 2.0 adapter.

    The gaming PC

    I know one wouldn’t normally think of a thin client as a platform to run games on. You might use it for streaming games to it, or maybe run some old retro titles locally, but that’s not my intention. I wanted to put together the fastest T740. That required the T740, of course, and an AMD W6400, which is currently the most power-efficient HHHL (half-height, half-length) GPU. It’s about as fast as a 7970, which is another card I used to have a long time ago, except that one easily pulled 300W of power.

    The W6400 in its full glory, a standard HHHL-sized card

    Along with the W6400, I installed 2x8GB of 3200MHz DDR4 RAM and a 2TB NVMe SSD. There were some problems, however. The BIOS does not recognize the W6400 as a GPU, as that’s not what HP used to bundle with the device, so the option to use the external GPU for booting and showing the BIOS screens is simply not available. That’s okay, though: Windows will happily output an image through the card’s connectors once it’s booted. The other, more serious issue is with driver compatibility. I could not get two copies of the official AMD drivers installed and working at the same time. This is a problem, because the integrated GPU can’t be disabled, and if its driver isn’t loaded properly, sleep doesn’t work and the iGPU pulls more power than necessary at idle, leaving less for the CPU cores. You see, one reason for using a dedicated GPU is the extra graphics power it adds, but another side effect is being able to leave the integrated GPU idle, so that the CPU cores can use the power budget that would otherwise have been consumed by the iGPU.

    Of course, adding things up, the 45W TDP of the CPU and the 50W board power of the W6400 come to 95W, which is already beyond the 90W rating of the external power supply. So it should be no surprise that this config was pulling 116W from the wall while playing Tomb Raider. Through the 90W PSU. And then it dropped down to 40-50W once the power supply’s built-in protection kicked in. Then back to 115W and again 50W. Don’t try this at home. I think I nearly killed that PSU. This was with the balanced power profile in Windows and the broken Vega 8 driver.

    There is an unofficial third party AMD driver from Amernime Zone, which allowed me to install both the Vega 8 and W6400 drivers and have them working at the same time. With this combo, the idle power was at 22-23W in the power saver profile and 28W in the Ryzen Balanced profile, which was installed by the AMD chipset drivers.

    But that’s not the interesting part. Even with the proper drivers, the balanced profile allowed the CPU to reach 3.6GHz and the system to draw up to 115W during games and 3DMark. To counter this, using the power saver profile with the maximum CPU speed set to 99% keeps the power consumption at around 85W, which is perfect. We can happily draw up to 95W from the 90W PSU, so 85W is not an issue, even for extended periods. But how does this affect performance? Well, below are some scores from a number of 3DMark runs that compare the balanced profile with power saver, capped at 99% CPU speed.

    Fire Strike    Graphics    Physics    Combined    Overall
    Balanced       12077       10550      3055        9169
    Power Saver    11962       7371       2790        8411
    Delta          -1.0%       -30.0%     -8.7%       -8.3%

    In Fire Strike, the physics scores are most affected, which is expected, as that’s the most CPU-bound test. The overall score only went down by 8.3%, which is not a very significant difference.

    Time Spy       Graphics    CPU        Overall
    Balanced       3357        3209       3333
    Power Saver    3499        2434       3283
    Delta          +4.2%       -24.2%     -1.5%

    Time Spy looks a bit different, as the graphics score actually went up a bit. This is because Time Spy pulls the most power, and in the balanced profile it triggered the PSU’s overcurrent protection multiple times during the test, causing the GPU to drop many frames. The drop in CPU performance is similar to Fire Strike’s, but the overall score only went down by 1.5%, which is barely noticeable.

    A larger PSU could probably help, but I don’t know how much thermal load the chassis could safely dissipate. 85W is probably okay, but 115-120W may be too much.

    I would like to note here that despite the insufficient PSU, I had no issues whatsoever with installing a graphics card. Some people reported that their T740 didn’t turn on with an RX 6400 installed1 and I also found some suggestions2 that the voltage regulator supplying the PCIe slot is only rated for 3A at 12V, which would cap its output at 36W.

    I think I managed to invalidate both of these claims on my units, but I am not suggesting that either of these statements is false. There may be BIOS or hardware revisions that refuse to boot with an RX 6400, and there may also be hardware revisions which are indeed limited to supplying 36W to the PCIe card.

    Closing thoughts

    I think the T740 is a great and incredibly versatile mini PC. Thanks to the potent CPU, iGPU and PCIe expansion slot, the possibilities are endless. It’s also a very low power platform: thanks to the soldered CPU and the lack of a chipset, the idle power without an expansion card is around 10W, and that missing chipset doesn’t really come with any disadvantages here. We still get two 10Gb/s and two 5Gb/s USB3 ports plus three USB2 ports, which should be more than enough for most use cases. All of this, coupled with full-fledged virtualization capabilities, makes this a great all-around box that can handle anything.

    But as with everything, there are a few negatives as well. First of all, the fan. It never stops. And while some extra airflow is a good thing if you have an expansion card installed, it would be nice to have an entirely passive mode when the system is idling, simply to reduce the amount of dust it sucks in.

    The power supply may also be an issue if a dedicated GPU is installed, but a 120W model can be had for £20, if needed. Although I would be a bit worried about the chassis’ ability to dissipate that much heat.

    And speaking of heat, the CPU fan, which is the only fan in the machine, is controlled by the CPU temperature. If the expansion card heats up, then tough luck: the CPU fan won’t go any faster and that heat will be trapped inside the case. If you play games and evenly load the GPU and the CPU, this shouldn’t be a problem, because the load on the CPU will speed up the fan, which will exhaust the heat generated by the GPU as well. The original card that HP bundled with the system had a blower-style cooler, which directed most of the heat it generated straight out of the system through the backplate, which solved the issue, but I couldn’t find an RX 6400 like that and the W6400 only comes in a single AMD variant.

    By the way, I think that AMD variant is actually the best one of all of the low-profile designs of Navi 24, as it actually includes a copper heatpipe that spreads out the heat along the full length of the heatsink, whereas I couldn’t see that on any of the RX6400 cards. The downside is that AMD didn’t allow zero RPM mode on the W6400, so even if there is no load, the fan will run at a very low RPM. This is inaudible, but it will increase the dust buildup on the heatsink.

    1. https://www.reddit.com/r/homelab/comments/xg313i/hp_t740_thin_client_with_radeon_rx_6400_gpu/ ↩︎
    2. https://www.parkytowers.me.uk/thin/hp/t740/ ↩︎
  • Genuine Intel(R) CPU 0000

    For most people, the above probably doesn’t mean much, but those who have a hunch that this article will be about an Engineering Sample are right. I found a cheap Chinese mATX motherboard on AliExpress which is really interesting for two reasons. One has already been given away by the title; the other will be obvious when you look at the image below.

    Yes, there is no socket. The board has an 11th-gen 45W Tiger Lake CPU soldered onboard, the kind you’d normally find in higher-end gaming laptops. Otherwise, it’s a perfectly normal mATX motherboard: it uses standard DDR4 sticks and you can plop in expansion cards. The heat spreader has been specially designed to accept a standard LGA115x cooler, so a regular desktop CPU cooler can keep the chip under control. Underneath, there is a perfectly normal H-series Tiger Lake chip with a separate chipset on the board, which is not visible here.

    It works quite well, but for whatever reason, I couldn’t get suspend to work properly in Windows (and I haven’t tried Linux). The BIOS seems to be fairly generic, with options that don’t explain much about what they do, or in some cases may not do anything at all. I also noticed that the 10Gb/s USB3 ports are a little finicky: some devices, like JMS583-based NVMe to USB 3.1 Gen 2 adapters, don’t get recognized, and that’s despite the newer A3 revision of the JMS583, which shouldn’t have these problems. And if you like to listen to music, you should really get an external sound card, as the analog output of the integrated one is really bad.

    Aside from these quirks, it’s really cheap for the performance you get and it’s also fairly power efficient. I did run some Cinebench tests, so have a look below:

    Cinebench     R11.5    R15     R20     R23
    Single        2.51     220     564     ~1464
    Multi         20.30    1861    4501    11678
    Multiplier    8.08     8.47    7.99    7.98

    The multiplier is always very close to 8 and that’s with hyper-threading enabled, which means the single-core turbo is quite aggressive, as hyper-threading normally gives a 30-50% advantage compared to a single thread per core setup.

    The board has been in a PC for the past 6 months and it’s been performing really well; aside from what I mentioned above, I have not encountered any stability issues or other problems.

  • Making a 6×4″ eInk picture frame

    My grandmother’s birthday is coming up and I was wondering what to give her when I remembered a video I watched a few months ago about an eInk picture frame. I checked Pimoroni’s site and they actually had the larger 7″ variants in stock, which can neatly fit in a 10x15cm (6×4″) picture frame, so I thought I would give them a try and ordered one.

    They have two flavours available: one comes with a 40-pin header for a Raspberry Pi and the other has a pre-soldered Pi Pico W. I chose the former, as it has no added bezels and I had a Pi Zero W lying around anyway. Power consumption wasn’t really a factor, as it can be plugged into the mains, and the Pi Zero allows me to run Syncthing to upload new pictures to the device remotely, without having to write any additional code to do it.

    The eInk screen, a 7.3″ Inky Impression

    The screen is pretty big, and while the 800×480 resolution is quite high for a colour eInk panel, my Kobo Aura One has a 1872×1404 panel, albeit only in grayscale. So there is definitely a tradeoff between resolution and the ability to display colours.

    Speaking of colours, the way these colour eInk displays work is quite interesting. Unlike LCDs, OLEDs and pretty much any other colour display panel, there are no subpixels. Instead, each pixel can be any of seven colours: black, white, red, green, blue, yellow and orange. Seemingly, there is also no way to create gradients within a colour, so a pixel is either fully green or not green at all. There is no in-between, so for gradients the screen uses dithering, which can be visible from shorter distances, but from about a metre away I can’t see the individual pixels anymore.

    The colours are nice, not quite as vibrant as real ink, but having this amount of colour is way better than having none at all.

    The build

    Setting it up is quite simple: just flash a copy of Raspberry Pi OS to a microSD card (I used Pi OS Lite, as I didn’t need the graphical interface) and set up the WLAN using sudo raspi-config.

    Pimoroni provides a GitHub repository with everything needed to make the display work, and they also have a script that installs and sets up the required software and libraries, which is what I used. Just run curl https://get.pimoroni.com/inky | bash and give it around 15 minutes to do its thing.

    Once done, you can use the image.py script in Pimoroni/inky/examples/7color to display an image. It automatically resizes them too. To use it, just run ./image.py <image_path> <saturation>, where image_path is the path of the image and saturation is a decimal number between 0 and 1 that makes the image more or less colourful. The saturation parameter is optional and defaults to 0.5 when omitted.
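
    For example, to show a single photo with slightly boosted saturation (the photo path is just a placeholder):

    # example run - the photo path below is just a placeholder
    cd ~/Pimoroni/inky/examples/7color
    ./image.py ~/photos/sunset.jpg 0.8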

    Syncthing

    To set up Syncthing, just follow the instructions below, or the ones on the official site. This should work on all Debian- and Ubuntu-based distros.

    # Add the release PGP keys:
    sudo curl -o /usr/share/keyrings/syncthing-archive-keyring.gpg https://syncthing.net/release-key.gpg
    
    # Add the "stable" channel to your APT sources:
    echo "deb [signed-by=/usr/share/keyrings/syncthing-archive-keyring.gpg] https://apt.syncthing.net/ syncthing stable" | sudo tee /etc/apt/sources.list.d/syncthing.list
    
    # OR add the "candidate" channel instead (both commands write to the same
    # file, so only run one of them):
    echo "deb [signed-by=/usr/share/keyrings/syncthing-archive-keyring.gpg] https://apt.syncthing.net/ syncthing candidate" | sudo tee /etc/apt/sources.list.d/syncthing.list
    
    sudo apt-get update
    sudo apt-get install syncthing
    
    # To automatically start the service on boot
    # replace myuser with the user you want to run syncthing as
    # you can use admin, or create a separate user for the service
    sudo systemctl enable --now syncthing@myuser.service

    As the GUI is only visible on localhost by default, we need to change the config file to make it accessible from other computers on the local network. To do this, the ~/.config/syncthing/config.xml file needs to be amended: the address has to be changed from 127.0.0.1 to 0.0.0.0 in the section below.

    <gui enabled="true" tls="false" debugging="false">
        <address>127.0.0.1:8384</address>
        <apikey>k1dnz1Dd0rzTBjjFFh7CXPnrF12C49B1</apikey>
        <theme>default</theme>
    </gui>
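
    If you’d rather do this from the terminal than in a text editor, a one-liner along these lines should work (assuming the default port and config location):

    # bind the Syncthing GUI to all interfaces instead of localhost only
    sed -i 's/127.0.0.1:8384/0.0.0.0:8384/' ~/.config/syncthing/config.xml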

    After the file is updated and saved, restart the service with sudo systemctl restart syncthing@admin.service. Replace admin with the user you’re running it as. You should now be able to connect to the GUI of syncthing from another PC’s web browser by navigating to port 8384 of the Pi’s IP.

    Setting up automatic updates

    For these, we’ll need a bash script, plus a systemd service and timer. The shell script takes a folder path and the location of the python script that updates the eInk display; both of these need to be absolute paths in the filesystem. It also has a saturation and a picture_buffer_size variable, which define the saturation value used when invoking the script and a buffer that prevents an image from appearing again for a set number of updates. This is done by appending a number to the filename, reducing it by one every time the script runs, then removing it completely when it reaches zero, and only allowing the selection of new images from non-suffixed files. It is important to note that the folder must contain at least one more file than the buffer size, otherwise no updates will happen for a number of refreshes if all of the available images have been used recently. Once created with the contents below, the script needs to be made executable using sudo chmod +x <filename>.

    #!/bin/bash
    
    # these need to contain the absolute paths, starting from root '/...'
    # the location of the python script that updates the eink screen
    SCRIPT_LOCATION=/home/admin/Pimoroni/inky/examples/7color/image.py
    # the location of the folder containing all of the pictures
    PICTURES_LOCATION=/home/admin/photos
    # the desired saturation of the screen, the default is 0.5
    SATURATION=1
    # the number of days/updates for which a picture is not displayed again
    # MUST BE LESS THAN THE NUMBER OF PICTURES AVAILABLE IN THE FOLDER
    PICTURE_BUFFER_SIZE=3
    
    echo 'Updating picture...'
    
    chosen_picture_path=''
    
    # randomly choose an image file without an appended index
    get_random_file () {
            chosen_picture_path=$(ls "$PICTURES_LOCATION"/*.jpg | sort -R | tail -1)
    }
    
    # decrease the suffixes on all recently used images, removing them once they reach 0
    reduce_indices () {
            for i in $(seq 0 "$PICTURE_BUFFER_SIZE")
            do
                    for filename in "$PICTURES_LOCATION"/*.jpg."$i"; do
                            # skip the iteration if the glob didn't match any file
                            if [ -e "$filename" ]; then
                                    old_location=$filename
                                    if [ "$i" -lt 1 ]; then
                                            # suffix reached 0: drop it so the image becomes selectable again
                                            new_location="${old_location%.$i}"
                                    else
                                            # otherwise decrease the suffix by one
                                            new_location="${old_location%.$i}.$((i-1))"
                                    fi
                                    mv "$old_location" "$new_location"
                            fi
                    done
            done
    }
    
    # add a numerical suffix to the randomly selected image, like an extension
    add_index () {
            mv "$chosen_picture_path" "$chosen_picture_path.$PICTURE_BUFFER_SIZE"
    }
    
    reduce_indices
    
    get_random_file
    
    echo "The new file is $chosen_picture_path"
    
    python "$SCRIPT_LOCATION" "$chosen_picture_path" "$SATURATION"
    
    add_index
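
    Before wiring it up to a timer, it’s worth making the script executable and running it once by hand to confirm the screen actually updates (assuming it was saved as /home/admin/update_eink_screen.sh, the same path used in the service file below):

    chmod +x /home/admin/update_eink_screen.sh
    /home/admin/update_eink_screen.sh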

    The file to create the systemd service needs to be named something like eink_update.service and has to be placed in /lib/systemd/system.

    [Unit]
    Description=The eInk image updater service
    
    [Service]
    Type=oneshot
    User=admin
    ExecStart=/home/admin/update_eink_screen.sh
    Restart=no

    And finally, the timer that triggers the systemd service goes in the same folder with the filename eink_update.timer. They need to have the same name, as a timer is automatically associated with the .service file of the same name.

    [Unit]
    Description=A timer for the eInk image updater service to run it each day
    
    [Timer]
    OnCalendar=daily
    Persistent=true
    
    [Install]
    WantedBy=timers.target

    After these are created, just run sudo systemctl daemon-reload and then sudo systemctl enable --now eink_update.timer. Then you can list all the active timers with systemctl list-timers --all. If eink_update.timer is there, it will be triggered automatically at midnight every day to randomly update the picture on the screen.

    Adding a frame

    Of course it wouldn’t be complete without adding a frame, so I got one from H&M UK (This is not a referral link) with a paper insert that has a 10x15cm cutout, which is almost the right size, although it’s not perfect. A custom paper insert could correct that. I used some oil to make the grains of the wood pop a bit more and give the frame a darker hue.

    To make sure that the display stays in place, I used tape to temporarily (yes, I know it’s not going to be temporary) affix the screen to the paper insert and used pieces of a 15x2mm aluminium profile to hold it in place. It’s quite easy to cut to length with a hacksaw.

    There are two pieces which hold the screen and one that functions as a stand, which is a longer piece that I bent by hand using a pair of pliers. The shorter ones have been cut off at an angle on one side to make them easier to slide in, as below.

    The stand is a 25cm-long piece, bent to a shape that makes it hold the frame at a slight angle. The ends just slide into the grooves of the frame.

    And that’s it. I think it looks pretty good. The back could be improved, but it’s not particularly visible, so I don’t really mind. A battery could be added to allow completely wireless operation, but I think the Pico W variant of the screen would be more suitable for that, as it uses a lot less power.

  • “thin” servers from Thin Clients

    The hardware

    I managed to snatch two Igel M350C thin clients off eBay for a very reasonable ~$45 a piece, shipped. That doesn’t sound all that impressive, but these actually have AMD Ryzen Embedded R1505G CPUs.

    Ryzen Embedded R1505G is a 64-bit dual-core embedded x86 microprocessor introduced by AMD in early 2019. This processor is based on AMD’s Zen microarchitecture and is fabricated on a 14 nm process. The R1505G operates at a base frequency of 2.4 GHz with a TDP of 15 W and a Boost frequency of up to 3.3 GHz. This MPU supports up to 32 GiB of dual-channel DDR4-2400 memory and incorporates Radeon Vega 3 Graphics operating at up to 1 GHz.

    This model supports a configurable TDP-down of 12 W and TDP-up of 25 W.

    Ryzen Embedded R1505G – AMD – WikiChip

    So, I think that helps to put it in perspective. These are very fast CPUs for a thin client, while only sipping about 4W of power at idle and still managing to be fanless. Not the usual Atom or AMD G-series with low-performance Puma cores from 2014. Fun fact: the PS4 and Xbox One used the predecessors of these cores, codenamed Jaguar. That’s why their CPUs are relatively weak, even though they have 8 cores.

    The die of the CPU “Banded Kestrel”, courtesy of Fritzchens Fritz on Flickr

    This image shows that the CPU does have four physical cores and there are 11 CUs for the integrated GPU, but most of it has been disabled, as we’re “only” getting two cores and 3 CUs. That is still very much adequate for the target market and it allows AMD to still sell most of the defective chips instead of creating a smaller, true dual-core CPU design.

    The thin clients themselves come with two DDR4 SODIMM slots, populated with 2x2GB modules, which isn’t much, but at least the memory is not soldered. What is soldered, however, is the 8GB internal eMMC drive, and there is no way to replace it or install another drive. That’s a problem. On one hand, it won’t use much power, but on the other hand, I’ll need to add external storage. Fortunately, there are three high-speed USB3 ports, so there is an easy, if not particularly elegant, way to do that. The internal heatsink is pretty large and there does seem to be an unpopulated mPCIe slot, although unpopulated in the sense that the slot itself is missing and only the solder pads are present.

    The internals

    They come with a proprietary Linux-based OS called Igel OS, which requires a license. However, it’s quite easy to format the internal eMMC and install any OS on it. Apparently, the BIOS password can be removed by resetting the BIOS using the onboard jumper, which is a joke, but hey, at least it saved me the time of cross-flashing the non-password-protected BIOS.

    Before I proceeded with the setup, I wanted to spin up Windows on one of them, just for fun. You may be aware that Windows 10 officially requires a 32GB drive, but there is an unofficial stripped-down variant called Tiny10, which installed just fine and even left a spare gig or two on the disk. This can make a fun little thin client for Parsec or another remote desktop client, with a really nice user experience, as the integrated Vega 3 GPU has hardware decoders for both H.264 and H.265.

    Out of curiosity, I did some power measurements on them and one of them in idle consumed between 3.9-4.4W, while two, connected to the same 12V power supply with a Y-splitter topped out at 6.5W, making it 3.3W per thin client. The reason why two don’t use twice the power is the conversion loss in the PSU. This is very close to the 2.7W of a Raspberry Pi 4B. The SSDs add an extra 0.7-1W per device when they are in idle, making the power consumption of a thin client with two SSDs ~7.2W.

    The plan

    I intend to use these thin clients as low power servers at remote locations. The aim is to make them fully autonomous and self-contained. I won’t be in direct control of the network environments of the sites where these will be deployed. It is also possible that they may be behind CG-NAT or dual NAT, so the possibility of setting up port-forwarding rules or UPnP should not be assumed.

    They may end up being thousands of miles away from me in the homes of not particularly tech-savvy people, so making them work under most conditions is a must, as they won’t be easily reachable for debugging.

    There are three things I would like to use them as:

    • Distributed file storage, using Syncthing
    • VPN exit nodes, using Tailscale
    • I might as well add a RIPE Atlas software probe

    I think it is important to note that while these will be used as file storage, they will not be storing the only copies of the files. Everything that is mirrored to them will also be stored on another array on a NAS with redundancy. The point of these thin servers is not to provide redundancy on their own, but to add geo-redundancy by essentially cloning the data to sites at different locations, ensuring that there will be at least one copy remaining in the event of a natural disaster or other physical intrusion. The chances of both the NAS and the thin servers failing irrecoverably at the same time are not zero, but slim enough that I’m willing to take them.

    The storage

    As mentioned before, I needed some external storage, so I used two 1TB Crucial MX500 SSDs for each thin client, connecting them with Sabrent EC-SSHD USB3 (5Gb/s) to SATA adapters. These are not a full enclosure, so they allow convection to cool the drives instead of isolating them inside a plastic casing, and they also don’t add any extra thickness, which made it easier to mount them on the side of the thin clients with some zipties. Not particularly nice, but it works well enough and I don’t intend to put these on display.

    The SSD “mounts”

    The drawback is that, like many other USB enclosures, these reuse the same serial numbers, so out of the 5 adapters, three had the same serial number, which made Fedora think they were essentially the same drive. I managed to shuffle them around so that no two of those three were used by the same thin client, but it’s something that could potentially cause a problem. Different manufacturing batches probably use different serials; I got mine from three different batches with serial numbers like [A, A, A, B, C], so I could pair them up as A-B and A-C, which worked.
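
    If you want to check for duplicated serial numbers before partitioning anything, lsblk can list what the kernel sees for each drive:

    # list block devices along with the serial number reported by the enclosure
    lsblk -o NAME,MODEL,SERIAL,SIZE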

    The SSDs are cheap consumer drives with TLC flash which I had lying around and since I won’t be writing huge amounts of data to them, they will be fine.

    These volumes will be encrypted, so that the files won’t be accessible if someone decides to plug them into their own PC. Yes, I know, I will need to store the decryption keys somewhere, which is likely going to be the internal eMMC and they could very well be extracted from there and thus used to unlock the drives, but someone would have to be in really desperate need of my files to go that far and I’m willing to take that risk. Syncthing also has a feature for adding untrusted devices and encrypting the data on them, although it is still in beta and may not be fully functional. This essentially encrypts the files on the destination nodes with a key that’s set on the source node.

    The internal 8GB eMMC will only be used as a boot drive and all containers and related data will be stored on the external drives.

    The execution

    I simply installed the standard Fedora Linux 37 Server Edition on the internal eMMC from a USB drive. UEFI boot works as expected and after the install and a dnf upgrade, the /boot partition is at 0.31/1.0GB and / is at 2.8/6.0GB. This isn’t too bad. I still have a few gigabytes left to install docker and I can move the container images to the external drives.

    It is important that all users including root have strong passwords, otherwise there isn’t much point in encrypting the external drives.

    Setting up the drives

    The ideal way to configure the drives would be to create a single volume spanning both disks, and I was considering ZFS for that. That said, using external drives with ZFS, or really any other filesystem, for storing important data is generally a bad idea.

    Fedora also has support for Stratis which is a storage management tool and it’s already embedded into the OS. They also made it accessible from the GUI, which is quite convenient and it supports encryption as well, for which it uses the cryptsetup library and the LUKS2 format. It has support for using a TPM or a remote Tang server for handling the keys. Unfortunately, the thin clients don’t have TPMs and I wasn’t going to set up a key server for this either.

    In the end, I went with a simple software RAID0 pool and an XFS filesystem on top of that, encrypted using LUKS2. It’s simple and fast and I would probably not use the added capabilities of ZFS or Stratis anyway. The passphrase is stored in the keyring and is automatically loaded after boot, so the drives are automatically unlocked.
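
    For reference, a rough command-line equivalent of this setup is sketched below. It assumes the two SSDs show up as /dev/sda and /dev/sdb and that the array should end up mounted at /mnt/storage, so adjust the device names before running anything:

    # create a striped (RAID0) array from the two USB SSDs
    sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda /dev/sdb
    # encrypt the array with LUKS2 and open it
    sudo cryptsetup luksFormat --type luks2 /dev/md0
    sudo cryptsetup open /dev/md0 storage
    # create the XFS filesystem and mount it
    sudo mkfs.xfs /dev/mapper/storage
    sudo mkdir -p /mnt/storage
    sudo mount /dev/mapper/storage /mnt/storage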

    The containers

    The first step was to install docker and then move the data-root directory of Docker, so that it stores everything on the external drives instead of the eMMC. This can be done as shown below.

    # stop docker
    sudo systemctl stop docker
    # create a new config file at /etc/docker/daemon.json
    sudo touch /etc/docker/daemon.json
    # add the below to the file using your favourite text editor
    {
      "data-root": "/<path_to_external_drive_array>/docker"
    }
    # save and restart docker
    sudo systemctl start docker

    All containers will need to be set to start automatically, for which there is a handy restart: always flag in compose.

    The easiest one is the RIPE Atlas software probe, for which there is a GitHub repo with a very detailed tutorial. This doesn’t need anything special; the compose file can be started as-is with docker compose up -d. The below is what I used. It will create a public key file at /var/atlas-probe/etc/probe_key.pub, which you have to register with RIPE.

    version: "2.0"
    
    services:
      ripe-atlas:
        image: jamesits/ripe-atlas:latest
        restart: always
        environment:
          RXTXRPT: "yes"
        volumes:
          - "/var/atlas-probe/etc:/var/atlas-probe/etc"
          - "/var/atlas-probe/status:/var/atlas-probe/status"
        cap_drop:
          - ALL
        cap_add:
          - CHOWN
          - SETUID
          - SETGID
          - DAC_OVERRIDE
          - NET_RAW
        mem_limit: "64000000000"
        mem_reservation: 64m
        labels:
          - "traefik.enable=false"
          - "com.centurylinklabs.watchtower.enable=true"
        logging:
          driver: json-file
          options:
             max-size: 10m
        # security_opt:
        #   - seccomp:unconfined

    Syncthing also has a great tutorial in its official repo, so using the below compose file should yield a usable instance.

    version: "3"
    
    services:
      syncthing:
        image: syncthing/syncthing
        container_name: syncthing
        hostname: my-syncthing
        environment:
          - PUID=1000
          - PGID=1000
        volumes:
          - /<local>/<path>:/var/syncthing
        network_mode: host
        restart: unless-stopped

    You will need to allow the ports it uses through the firewall in Fedora, because the network mode is set to host. This can be done in Cockpit, just add the pre-defined syncthing and syncthing-gui services to the Public Zone, as shown below.
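
    If you prefer the command line over Cockpit, the equivalent firewalld commands should be along these lines:

    # open the Syncthing sync and GUI ports in the public zone
    sudo firewall-cmd --permanent --zone=public --add-service=syncthing
    sudo firewall-cmd --permanent --zone=public --add-service=syncthing-gui
    sudo firewall-cmd --reload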

    Additionally, once it’s up and running, you can add your server/NAS to Syncthing and set up Auto Accept of shares, so that the client will automatically accept any shares that you share with it from the server.

    Tailscale

    I’ve attempted to use the tailscale container, but I’ve been unable to set up the firewall rules to make it work as an exit node. So instead of spending a few hours trying to figure out what’s wrong, I decided to use the .rpm package instead. In addition to the installation procedure, we need to configure the firewall according to this guide to allow the exit node to redirect all network traffic from the clients. This will also have the added benefit of allowing us to stop and restart docker remotely, if we ever need to.
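
    For reference, the setup roughly boils down to the below; the repo URL and sysctl keys are the ones from Tailscale’s Fedora and exit-node documentation, so double-check them against the current guides before running:

    # install Tailscale from the official Fedora repo
    sudo dnf config-manager --add-repo https://pkgs.tailscale.com/stable/fedora/tailscale.repo
    sudo dnf install tailscale
    sudo systemctl enable --now tailscaled
    # enable IP forwarding so the exit node can route client traffic
    echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
    echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
    sudo sysctl -p /etc/sysctl.d/99-tailscale.conf
    # let firewalld masquerade the forwarded traffic
    sudo firewall-cmd --permanent --add-masquerade
    sudo firewall-cmd --reload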

    To authenticate the client, we have to execute the below after the setup is complete. This will persist across reboots. For some reason, the machine name changes to localhost-0 after a reboot, but we can manually override it in the Tailscale admin console.

    # authenticate and enable exit node
    sudo tailscale up --advertise-exit-node

    It’s also important to disable the key expiry once the clients are added, so they won’t need to be re-authenticated every 90 days.

    Final touches

    Of course, all passwords need to be changed, both for the OS (including the root account) and the Syncthing GUI, and stored somewhere safe, along with the drive encryption keys.

    The Syncthing GUI could also be disabled by removing the firewall rule after setting everything up, to prevent unauthorized access and brute-forcing attempts.

    It is a good idea to set up the thin clients to turn on automatically when they are powered up, but how this can be done is highly dependent on the BIOS of the machine you’re using. They often have a very cut-down BIOS, but in my experience, this option is still present in almost all of them.

    I’ve yet to deploy them “in the field”, but I think the above should be enough to make them work. I managed to obtain some inexpensive 8GB DDR4 modules, so they each got an upgrade to 16GB of RAM, which is nice, if probably unnecessary. I’ll report back on how well they work once they are in place.

  • Deploying WordPress on ORACLE Cloud

    This is a guide on how to deploy WordPress (or any other container-based service) on ORACLE Cloud and set up a URL with a Dynamic DNS provider. You could use the guide for a Teamspeak or Minecraft server as well. We’ll use a free-tier ARM-based VM, which utilizes their Ampere Altra 80C CPU. The free tier allows you to use 4 CPU cores and 24GB of RAM in total. This can be split between multiple VMs, such as 2/1/1 cores with 12/6/6GB of RAM respectively. The bandwidth you get is 1Gb/s per core, so for a 2-core instance it’s doubled.

    The guide is composed of four main parts:

    • Setting up an OCI account and the VM
    • Installing Docker and Docker compose with a management UI
    • Installing WordPress in the Docker environment
    • Setting up a domain name and an A record on HE’s free DNS servers

    OCI (Oracle Cloud Infrastructure) uses Cloud Accounts, which identify your account as a whole; you can have multiple user IDs with their own email addresses and privilege levels associated with a Cloud Account. Cloud Accounts are tied to a geographical region, in which you will be able to create the free-tier VMs. You will NOT be able to change this later (as of 2023).

    Next, go to the compute instances page and click “Create instance”. On the next page, enter a name and click “edit” under the “Image and shape” section. Then change the shape to a “VM.Standard.A1.Flex” instance as shown below.

    You can add more cores and memory, but I found that WordPress is quite happy with a single core and 6GB of RAM, so that’s what I used. Under the SSH keys section, you’ll have to save the private key, as that’s what you’ll use to SSH into the machine for setup. There are no passwords.

    Under the “Boot volume” section, you can specify a custom size; the default is 46.4GB. In the free tier, you get a total of 400GB of space across all of your boot volumes and backups. You can split that up however you want, but I don’t think you would need more than the default, unless your site is pretty big or you’re planning to run other containers alongside WP in the same VM. Once you’re happy with the settings, click “Create”.

    Now that we have a VM with Oracle Linux 8, we need to install Docker, Docker Compose and a test container.

    The command blocks below are already prefixed with sudo for convenience. Exercise caution when copying and executing them in a live environment.

    So let’s begin. You’ve got the SSH .key file from when you created the VM, and its public IP. To connect to it, use:

    ssh -i <path_of_.key_file> opc@<public_ip>

    You should now have a command prompt inside the VM. To update the OS, install Docker and Docker Compose, execute the below:

    # update the packages
    sudo dnf update -y
    # add the Docker repo to the package manager
    sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
    # install Docker
    sudo dnf install docker-ce -y
    # enable and start the service
    sudo systemctl enable --now docker
    # install the Docker Compose plugin, if necessary
    sudo dnf install docker-compose-plugin
    # enjoy

    Now we should have everything up and running. To check the configuration, we can use sudo docker info and run a test container with sudo docker run hello-world. With this, Docker is ready to use.

    If you don’t want to use the sudo prefix for everything docker-related, add the opc user to the docker group.

    # create docker group, although it should already exist
    sudo groupadd docker
    # add the opc user to the group
    sudo usermod -aG docker opc
    # re-evaluate group memberships without logging out
    newgrp docker

    Let’s talk about Docker Compose. To avoid possible confusion, I would first like to mention that there are two variants of compose. There is Compose Standalone, which is used with the docker-compose command and there is the Compose plugin, which is a plugin for Docker. This requires the docker compose command. Note that the dash has been replaced with whitespace. This is what we’re using and we’ve already installed it in the code block above.

    To install WordPress, we need to create a folder and inside that we need to add a file named docker-compose.yaml with the contents below:

    version: "3.9"
        
    services:
      db:
        image: mysql:latest
        volumes:
          - db_data:/var/lib/mysql
        restart: unless-stopped
        environment:
          MYSQL_ROOT_PASSWORD: somewordpress
          MYSQL_DATABASE: wordpress
          MYSQL_USER: wordpress
          MYSQL_PASSWORD: wordpress
        
      wordpress:
        depends_on:
          - db
        image: wordpress:latest
        volumes:
          - wordpress_data:/var/www/html
        ports:
          - "80:80"
        restart: unless-stopped
        environment:
          WORDPRESS_DB_HOST: db
          WORDPRESS_DB_USER: wordpress
          WORDPRESS_DB_PASSWORD: wordpress
          WORDPRESS_DB_NAME: wordpress
    volumes:
      db_data: {}
      wordpress_data: {}

    This contains all the data that Docker requires to create an environment for WordPress. If you’re curious about what the commands mean and how to tweak them, you can find a guide on wordpress.org.

    Execute the docker compose up -d command in the folder containing the .yaml file and your stack should be up and running in a few seconds. Note that the website will not yet be accessible at this point.

    You may receive an error such as the one below when attempting to start the stack: no matching manifest for linux/arm64/v8 in the manifest list entries. In this case, you can manually download the two images with docker pull wordpress and docker pull mysql. This will pull the linux/arm64/v8 variants of the images.

    Next, we have to open the ports. Once on the VM’s firewall and once on the Virtual Cloud Network.

    The firewall is very simple to update:

    # open port 80 for TCP
    sudo firewall-cmd --permanent --zone=public --add-port=80/tcp
    # reload firewall settings
    sudo firewall-cmd --reload

    Next we need to set up something similar to a port forwarding rule on OCI. There is a subnet connected to the VNIC of our Compute Instance, which has a Security List that behaves very much like a firewall. We need to add an extra rule to this to allow inbound connections on port 80.

    Click on the subnet name in the details page of the Compute Instance.

    Then you can select the default Security List.

    And finally add an Ingress Rule.

    We should allow incoming traffic to port 80 from all source IPs. This will look like the below:

    Once this is saved, you can go to the public IP displayed on the Compute Instance’s details page and you should be greeted with your WordPress’s setup page.

    The next stage depends on how you wish to set things up. As ORACLE provides a fixed IPv4 address, Dynamic DNS is not a requirement here. You can simply buy a domain name and add an A record, pointing to your site’s public IPv4 and be done with it. That’s what I’m going to do here.

    I purchased the domain from GoDaddy. Their website is very simple, so I won’t explain the process here. You can use them as the DNS / name server, but I wanted to delegate that to Hurricane Electric, which GoDaddy allows through an option to override the nameservers. The reason why I chose HE instead is that they support Dynamic DNS, and while this domain won’t be using it for the time being, I may set up others there that will. If you’re happy with GoDaddy, you won’t really need HE, as you can directly add an A record on their website. However, if you do want to change the nameservers, carry on reading below.

    Go and register on Hurricane Electric Hosted DNS (he.net). Once that’s done, log in and choose “Add a new domain”. Simply enter the domain name you chose and click “Add Domain!”. You’ll be greeted with the below, telling you that the delegation was not found, so let’s fix it.

    Go back to GoDaddy and click “DNS” on the right of your new domain name, which should take you to the DNS Management page. In the Nameservers section, click on “Change”, then choose “Enter my own nameservers (advanced)” in the popup. Enter the five nameservers below:

    ns1.he.net
    ns2.he.net
    ns3.he.net
    ns4.he.net
    ns5.he.net

    Save it and grab a tea. You’ll probably need to wait until the next day to proceed, as delegating a domain from one nameserver to another may take up to 24 hours. You’re also done on GoDaddy, the rest of the setup will need to be done on HE.

    So, 24 hours later, you can go back to HE and click the “Edit Zone” icon just on the left of the domain name you added. Here, you need to create a new A Record, so click “New A” and fill in the popup as shown below:

    Once that’s done, you should be able to reach your website using the URL. You may need to refresh the DNS resolver if you’re using pfSense or any other local DNS cache.
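
    To check propagation without touching your local cache, you can query HE’s nameservers directly (the domain below is just a placeholder):

    # ask one of HE's nameservers for the freshly created A record
    dig +short example.com A @ns1.he.net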

    Tweaks

    I usually like to increase the file size limits, which can be done by adding a php.ini file to the root folder of the docker volume that WordPress is using. To get the path, use the commands below:

    # get the location of the docker volume
    docker volume inspect wordpress_wordpress_data
    # this will give us the "Mountpoint" on the local filesystem
    # we need to be root to enter this folder
    sudo su
    # navigate to the folder containing the docker volume
    cd <mountpoint>   # this is going to be something in /var/lib/docker/volumes/...

    Now that we’re in the root folder of the wordpress_data docker volume, we need to create a php.ini file there with the below contents, then restart WordPress after saving it, using the docker compose restart command from the folder containing the docker-compose.yaml file.

    memory_limit = 256M
    post_max_size = 128M
    upload_max_filesize = 512M

    The above changes the limits set by the hosting provider, which is us in this case. To change the limits set by WordPress itself, you will need to install a plugin called “Wp Maximum Upload File Size” by CodePopular. That will let you increase the limits within WordPress.

    If you’d like to transfer the contents of an existing WP site, you can use the “All-in-One WP Migration” plugin, version 6.77. The version is important, as newer versions may have issues when restoring backups. There is a guide on how to use it here. Install it on both instances, back up the old one and restore it on the new one. That’s it.

    By now you should have a working WordPress installation, so enjoy! This very instance has been set up using this guide, so you’re most likely getting this page from ORACLE’s datacenters in London.

  • Daisy-chains with USB

    Yes, that’s a term that’s usually associated with Thunderbolt or Firewire. It’s also possible with USB4, but that’s not the main focus of this article. I have a few USB hubs on hand and I would like to see if I can chain them up, and if so, how that will affect the performance.

    I’ve always had the impression that USB hubs degrade performance somewhat, because the signal needs to pass through an additional device. But is that actually the case? I mean, it’s not audio, it’s a digital bitstream.

    A hub allows you to use multiple USB devices while connecting to a single uplink port on your PC. That naturally means that the bandwidth of that uplink port is shared between the four downlink ports. Not necessarily equally, because one device can, in theory, use up all of the available bandwidth if no other devices need it.

    So as I said, I’ve got my hands on a pile of USB3 (USB 3.0, to be exact, but that’s almost the same as USB 3.1 Gen 1 or USB 3.2 Gen 1) 5Gbps hubs. They are StarTech ST4300USBM devices with optional external power inputs, which I will use, as a single port probably wouldn’t be able to power several hubs, let alone the devices I plug into them.

    The hubs

    They use the VLI VL811-Q8P chipset (or one of them does, I didn’t open all 7) which is a chipset that operates on both USB2.0 and 3.0. That’s important because it’s possible to make a USB3.0-only hub that isn’t backwards compatible with USB2.0. The chipset is rated at 1W of power dissipation and there is probably some more conversion loss elsewhere. The hubs themselves are quite nice, they can deliver up to 2.4A on a port with a combined maximum of 20W. They also accept 7-24V on both inputs, the barrel connector is wired up the exact same way as the screw terminal.

    The internals. Note the dual power regulators on the sides.

    Update: According to Startech’s website, the hub uses the VL817 chipset, so there may be a number of different revisions out there. The VL817 is actually a USB3.1 Gen1 chipset, though it’s still 5Gbps.

    I’m not really interested in using multiple devices for testing, so I’ll use a single USB flash drive. A very fast one. It’s a 256GB PCIe-based M.2 SSD in a USB3.1 Gen2 enclosure, which can do 10Gbps in a compatible host. The other interesting factor could be delay, but I don’t have any equipment to measure delay on the millisecond scale. Furthermore, these hubs only have the VLI chip itself, there is no added RAM (unless it’s in the same package) that could act as a cache and thus add latency to the chain between the up- and downlinks.

    The methodology is quite simple: I will be running CrystalDiskMark and recording the sequential read and write performance. The hubs will be chained up one after the other and that’s it.

    Just like that, although I added one more later on.

    The base speeds of the drive are 453/449MBps on the same port that I’ll connect the hubs to. On a 10Gbps port, this drive can achieve 980/978MBps, but since the hubs can only do 5Gbps, I’ll be testing them on a native 5Gbps port. It’s actually a port from an Intel C422 PCH.

    So here comes the interesting part

    The speeds with one hub are 451/446MBps. Not bad, pretty much the same as before.

    Two hubs chained up give us 449/451MBps. Okay.

    With three hubs, we’re at 448/452MBps. Still no change.

    Four hubs: 441/453MBps. Disappointing.

    And finally, with a tower of five hubs it’s still as fast as ever. 441/452MBps.

    And I ran out of USB cables, so this is as far as I went. Looking at the above, it probably wouldn’t have been much different with 7 hubs. At least not because of speed degradation.

    USB has a limit of 127 devices per controller, which means you can only ever have a maximum of 127 USB devices (hubs included) connected to a single USB controller. However, there is another limit on the number of tiers you can have, and it’s 7, including the root hub (the USB controller itself) and the device. This means that there can only be 5 USB hubs in between, so I’ve actually managed to exhaust that limit with the above test, and adding a 6th hub would have likely crashed things.
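
    On Linux, you can see how deep each device ends up sitting in this tree with lsusb, which prints the hub hierarchy and the negotiated speed of every port:

    # print the USB topology as a tree, hubs and link speeds included
    lsusb -t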

    Well, that’s it, I hope you enjoyed the read and aren’t too disappointed with the results. It was quite impressive to see that there is no performance loss at all.

  • Steam Deck SSD upgrade

    This is meant to be a quick guide on how to upgrade the SSD of a Steam Deck to make more space for games. What you’ll need is an M.2 2230 SSD, a PH1 screwdriver, some plastic cards or prying tools to open it up, and a box cutter knife. Oh, and a USB drive with a Type-C adapter and the SteamOS image from Valve’s website. You’ll also want to make sure the Deck is fully charged before the operation, as you may not be able to connect the USB drive and the charger at the same time for the reinstallation, unless you have the necessary adapters at hand.

    The plastic cards I used were included with screen protectors I purchased in the past, they are about 1/3 as thick as a credit card and very flexible. I found these to be the best tool and I managed to open up the device without a single scratch.

    Methodology

    First, get everything ready; I suggest you read through the steps before starting to disassemble the Deck. It’s also a good idea to find something soft, like a cloth mousepad, that you can lay the Deck down on, screen facing downwards, without scratching it. Do not exert too much downward force on it though, because that may damage the joysticks. Also, REMOVE THE MICRO SD CARD if you have one installed.

    There are 8 screws at the back, 4 short ones in the middle and 4 longer ones further out. Remove all of them.

    The next step is to unclip the plastic tabs that hold the back panel in place. This is where you’ll need the plastic cards; the thinner the better. Mine were about ⅓ of a mm. These seem to be ideal and they don’t leave any marks. Regular plastic cards may also work, but they could leave some marks on the plastic when you use them. The first side to unclip is the top with the grill. Slide in the card at one of the sides, about 2-3mm deep, and start pulling it across gently. If you’re doing it right, you’ll hear audible clicks as the tabs are released. Once you’ve gone all the way across, it should look something like this.

    The next are the two sides. Slide in the card at the edge and pull it downwards. This may need a tiny bit more force. Be careful not to slide it across the plastic if it pulls out, as that could leave scratches.

    Using another card to help can make things easier.

    Once both sides are unclipped, you can fold the back panel up, lifting it at the top edge and gently pulling it outwards.

    The back panel should now be released. Let’s have a look inside.

    You’ll need to remove the three screws indicated by the circles. One is underneath a metallic sticker, which you should carefully peel back just enough to access the screw underneath. You can use the blade of a box cutter knife, or anything that’s sharp enough. Be careful not to tear it. I suppose it’s meant to create an air channel inside the metallic shield by separating it from the rest of the casing.

    Next is to remove the old drive. In my case it was a 64GB eMMC drive.

    Once that’s done, we’ll have to slide the foil shield off the old drive and wrap it around the new one. The 64GB drive was significantly thinner than the 512GB SK Hynix BC711 I replaced it with, so I had to unfold the shield slightly to get it around the new drive.

    Once that’s done, it can go straight back in place of the old drive. From here onwards, just go through the steps in reverse. Once it’s all assembled, follow the instructions on Steam’s website on how to install SteamOS from scratch; you’ll want to re-image the deck. If the new SSD has any partitions on it, the installer may fail, in which case you can use the partition manager to delete them before attempting the installation again. After rebooting, it may get stuck on static screens and logos for 10+ minutes, but that’s normal. Just make sure it’s plugged in.
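
    If you’d rather do that cleanup from a terminal in the recovery environment instead of the partition manager, something along these lines should work; the device name below is an assumption, so verify it with lsblk first:

    # find the internal SSD first; /dev/nvme0n1 is an assumption, check the sizes in the output
    lsblk
    # then clear any leftover partition table and filesystem signatures from it
    sudo wipefs -a /dev/nvme0n1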

    If everything went well, you should have a deck with more space for your games library.

  • Setting up VMs for gaming on a dual-CPU machine

    Far Cry 4 was being given away for free a couple of days ago on Ubisoft Connect (formerly Uplay) and, as a fan of the series, I got a copy for my girlfriend so that we could play together in co-op. It’s fun and definitely worth a try if you like FPS games.

    The setup we use for playing games is quite unique. We have a computer with two 12-core CPUs and two dedicated GPUs. This runs two VMs, each with its own GPU and USB controller assigned to it using PCIe passthrough. Theoretically, this should be more than enough for an 8-year-old game to run smoothly, but when I started the game, it turned out to be a really bad experience. It wasn’t the usual FPS lag. The game was running at a stable 60+ FPS, but whenever I attempted to move the character or pan the camera around, the frame rate immediately plummeted to the single-digit range. I’ve seen a similar issue when trying to play Control on the same VM, but I thought it was simply a matter of an underpowered CPU or GPU, as I’ve never tried running that game on bare metal.

    So, what’s happening? As a sanity check, I decided to do a quick bare metal Windows installation to see how the game runs there. It was perfectly fine, which means that the issue originates from the virtualization layer. Let’s have a look at the physical machine running the hypervisor:

    The block diagram of the PC running the hypervisor, courtesy of ASUS

    The block diagram shows the two CPUs, which are Xeon E5-2696v2s, the four memory channels connected to each of them and the two QPI links between them. This means there are two NUMA (Non-Uniform Memory Access) nodes. CPU1 can access the contents of its own half of the memory directly, but can only reach the other half by going through CPU2 over the QPI links. The QPI links are not as fast as the internal bus of the CPU, and as such, these detours add a significant amount of latency. If both CPUs work on the same dataset, which is stored in one CPU’s memory, then the other CPU experiences a significant overhead when accessing it. H.265 encoding is one of these situations, as I routinely see the secondary CPU only running at around 60-70% while the primary is at 100%, likely due to this added overhead.

    This can also affect the performance of a VM running on the computer. By default, the cores for a VM are allocated in a nondeterministic way, which means they may be spread across the two CPUs, introducing the scenario above. This is further amplified if the VM has PCIe devices attached to it, as those devices are wired to one of the CPUs: if the thread generating the load for the GPU runs on CPU1 while the GPU is attached to CPU2, then everything first has to cross the QPI link.

    To add a little meaning to the above diagram, here are the speeds of the buses involved:

    • The QPI links run at 8GT/s each, which works out to 16GB/s per direction, or 32GB/s bidirectional, per link. As there are two of them, that’s 32GB/s in each direction between the sockets.
    • The memory is running at 1333MHz, which is 10.6GB/s per channel. For a quad-channel config, that’s 10.6 × 4 ≈ 42.6GB/s per CPU.
    • The Quadro K5000 and Vega 56 are connected at 8GB/s and 16GB/s respectively. One lane of PCIe 2.0 can do 5GT/s, which is 500MB/s and a 3.0 lane can do 8GT/s, which is 0.98GB/s.
    • The chipset is connected at 2GB/s using a DMI 2.0 link to the primary CPU. It’s really just a fancy name for an x4 PCIe 2.0 link.

    You can find out a lot more about these interconnects on Wikipedia.

    Let’s get back to the issue: the extra latency introduced by the QPI link. The solution is to take it out of the equation and ensure that the VMs never need to send any data over it. That can be achieved by restricting the pool of CPU cores the VMs are allowed to use, or by creating a direct mapping between physical and virtual cores. The latter is called CPU pinning and is exactly what I’ll set up.

    The issue surfaced on Proxmox, and I decided to move to a different hypervisor. I tested XCP-ng, which worked really well out of the box, but I wanted a little more flexibility, so in the end I settled on Fedora Server 36. It has a nice GUI to control the VMs, but we’ll need some more in-depth settings, which can be added using the virsh command line tool.
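
    For reference, the snippets below all go into the VM’s libvirt domain XML, which you can inspect and edit with virsh (the VM name gamingvm is just a placeholder for illustration):

    # dump the current definition to stdout to see what's there
    sudo virsh dumpxml gamingvm
    # open it in an editor; libvirt validates the XML when you save
    sudo virsh edit gamingvm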

    There are a couple of things we need to do:

    • Isolate the CPU cores from the host
    • Pin these cores to the VM
    • Remove Hyper-V optimizations
    • Set the CPU type to host-passthrough

    The first can be done by adding the isolcpus parameter to the GRUB_CMDLINE_LINUX line of the /etc/default/grub file, then updating the grub configuration using grub2-mkconfig.

    GRUB_CMDLINE_LINUX=" ...isolcpus=0,1,2,3,4,5,6,7... "

    Of course, adjust the list to the cores your system actually has and leave a few for the host; Linux won’t schedule any threads on the listed cores. There are other parameters required to configure the PCIe passthrough, but this post is about the optimizations and not the passthrough itself.
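
    On Fedora, regenerating the config and checking the result looks something like this (the grub.cfg path is the usual default, but double-check where yours lives):

    # rebuild the grub config so the new kernel command line is used on the next boot
    sudo grub2-mkconfig -o /boot/grub2/grub.cfg
    # after a reboot, this should list the isolated cores
    cat /sys/devices/system/cpu/isolated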

    Next, we have to edit the VMs. I’ve noticed that using Hyper-Threading makes things noticeably worse, so I’m only using one thread per core. The way the threads are numbered depends on the OS; the lscpu command can help in figuring this out. In my case, logical CPUs 0-11 and 24-35 belonged to CPU1 and 12-23 and 36-47 belonged to CPU2, and the threads were paired up so that 0 and 24, 1 and 25, 2 and 26 and so on belonged to the same physical core. The cputune section below tells the hypervisor to pin 10 cores of CPU2 to the VM, with two more of CPU2’s cores used for the emulator threads, and the numatune section ensures that all the RAM the VM gets allocated stays within CPU2’s node. (These use 0-based indexing, so in reality there is CPU0 and CPU1.) The cpuset values used here, together with their corresponding HT threads, are what should go into the isolcpus parameter in grub, as described in the previous section.

    <vcpu placement='static'>10</vcpu>
    <cputune>
            <vcpupin vcpu='0' cpuset='14'/>
            <vcpupin vcpu='1' cpuset='15'/>
            <vcpupin vcpu='2' cpuset='16'/>
            <vcpupin vcpu='3' cpuset='17'/>
            <vcpupin vcpu='4' cpuset='18'/>
            <vcpupin vcpu='5' cpuset='19'/>
            <vcpupin vcpu='6' cpuset='20'/>
            <vcpupin vcpu='7' cpuset='21'/>
            <vcpupin vcpu='8' cpuset='22'/>
            <vcpupin vcpu='9' cpuset='23'/>
            <emulatorpin cpuset='12-13'/>
    </cputune>
    <numatune>
            <memory mode='strict' nodeset='1'/>
    </numatune>
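
    To double-check the numbering on your own machine before filling in the cpuset values, and to verify the pinning once the VM is running, something like this helps (the VM name is again just a placeholder):

    # one line per logical CPU, showing which node, socket and physical core it belongs to
    lscpu --extended=CPU,NODE,SOCKET,CORE
    # with the VM running, list the actual vCPU-to-physical-CPU pinning
    sudo virsh vcpupin gamingvm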

    The next step is to remove the Hyper-V optimizations that are added by default. This is quite simple: just remove the entire <hyperv> section from the VM’s config, which usually looks something like this:

    <hyperv>
       <relaxed state='on'/>
       <vapic state='on'/>
       <spinlocks state='on' retries='8191'/>
    </hyperv>

    The last step is to set the CPU type to host-passthrough or, if that’s not available, choose the option that’s closest to the host CPU’s architecture. This tells the VM what instruction sets and extensions are available on the CPU and can help speed things up.
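
    In the domain XML this is just the cpu element; a minimal sketch is below (the topology line is optional, but if you add it, it should match the number of vCPUs you assigned):

    <cpu mode='host-passthrough' check='none'>
            <topology sockets='1' cores='10' threads='1'/>
    </cpu>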

    You can also disable HT altogether in the BIOS, but I have a few more VMs which can take advantage of the extra threads and aren’t used for gaming, so I decided to leave it enabled. As you have probably noticed, I isolated both threads of 10 cores from each CPU, but only assigned one thread from each core to the gaming VM. This prevents the host OS from using either of the threads; otherwise the host could schedule work on the non-isolated sibling thread, which would affect the performance of the other thread on that core.

    I hope someone will find the above useful. I spent a few days experimenting with the different settings and found that this was enough to make the games playable. However, there are likely more tweaks out there that can further optimize the performance, so if you have some ideas, do leave them in the comments.

  • Aruba AP to IAP conversion

    This is just a small snippet on how to convert a controller-based Aruba AP to a standalone IAP. This writeup is for an AP-274, but it should work similarly on most ArubaOS-based access points.

    You’ll need:

    • The access point
    • Console access (a serial or micro-USB cable, depending on the AP)
    • A valid firmware file from Aruba
    • A TFTP and a DHCP server

    We’re going to interrupt the boot process using the serial console and overwrite the image with the one that’s hosted on our TFTP server.

    Before we touch the AP, we need to set up the TFTP server. I usually like to rename the firmware file to something that’s simple to type and just place it in the root folder of the TFTP server. The AP will look for the file on a computer with the hostname aruba-master; this can be overridden, but it’s what I’ll use in this writeup. So, set the hostname of the computer that’s going to host the TFTP server and start the server with the file in place.
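
    As a sketch of what that can look like on a Linux box (dnsmasq as a one-off TFTP server is just one option, and the directory is simply wherever you put the renamed firmware):

    # set the hostname the AP will look for
    sudo hostnamectl set-hostname aruba-master
    # serve /srv/tftp over TFTP only; --port=0 disables dnsmasq's DNS part
    sudo dnsmasq --port=0 --enable-tftp --tftp-root=/srv/tftp --no-daemon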

    Next, we need to access the bootloader, apboot. Follow the below instructions:

    1. Connect the serial cable and open a serial console. The settings are usually 9600 baud, 8 data bits, 1 stop bit, no parity, no flow control, but this may vary based on what you’re flashing.
    2. Once the console is open, power on the AP.
    3. Interrupt the boot process by pressing ENTER when prompted; you’ll only have 2 seconds to do this.
    4. If it didn’t work, leave the serial cable in place and reboot the AP to try again.

    Once you see the apboot> prompt, you have to verify the DHCP settings to ensure the network is properly set up. Just use the dhcp command. help lists all the available commands, and help <command> gives you some brief instructions on how to use them.

    The output of dhcp should look something like this:

    eth0: link up, speed 1 Gb/s, full duplex
    DHCP broadcast 1
    DHCP IP address: 172.16.34.170
    DHCP subnet mask: 255.255.0.0
    DHCP def gateway: 172.16.0.1
    DHCP DNS server: 172.16.0.1
    DHCP DNS domain: local.domain
    

    And printenv should look like this:

    bootdelay=2
    baudrate=9600
    autoload=n
    boardname=K2
    servername=aruba-master
    bootcmd=boot ap
    autostart=yes
    bootfile=e500.ari
    ethaddr=ac:a3:1e:<redacted>
    stdin=serial
    stdout=serial
    stderr=serial
    ethact=eth0
    gatewayip=172.16.0.1
    netmask=255.255.0.0
    ipaddr=172.16.34.170
    dnsip=172.16.0.1
    
    Environment size: 280/131068 bytes
    

    In the above, the servername parameter is the hostname where the AP will look for the TFTP server. You can change it using setenv servername <new_hostname>. Setting the serverip environment variable to the server’s IP address should also work.
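
    For example, to point it at the server by IP as well as by name (the address is just an illustration; saveenv should persist the environment on U-Boot-derived loaders like apboot, though it isn’t strictly needed for a one-off flash):

    apboot> setenv servername aruba-master
    apboot> setenv serverip 172.16.34.5
    apboot> saveenv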

    So far, we haven’t changed anything, but from here on we will, so here’s the usual disclaimer:

    ONLY PROCEED AT YOUR OWN RISK. There is a chance that you can brick the device, for which I will not be responsible.

    Now we have to erase the boot image and flash the replacement from the TFTP server. Some APs have two OS partitions, in which case we have to erase both.

    Using osinfo you can determine whether that’s the case or not:

    Partition 0 does not contain a valid OS image
    
    Partition 1:
        image type: 0
      machine type: 25
              size: 6659784
           version: 6.4.0.0
      build string: ArubaOS version 6.4.0.0 for 22x (p4build@cyprus) (gcc version 4.5.1) #41487 SMP Sun Dec 29 18:41:34 PST 2013
             flags: preserve factory
    
    Image is signed; verifying checksum... passed
    Signer Cert OK
    Policy Cert OK
    RSA signature verified.

    My AP does have two partitions, so to clear both, we have to run:

    apboot> clear os 0
    Erasing flash sector @ 0xee000000.... done
    Erased 1 sectors
    apboot> clear os 1
    Erasing flash sector @ 0xee000000.... done
    Erased 1 sectors
    

    Once that’s done, the expected output of osinfo is:

    Partition 0 does not contain a valid OS image
    
    Partition 1 does not contain a valid OS image
    

    Next, we have to flash the new image with the upgrade os <filename> command, where the filename is the name of the firmware file on the TFTP server. If everything is set up correctly, this will pull the file from the TFTP server and flash it to partition 0. It’ll take a few minutes to complete.
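
    With the firmware renamed to, say, instant.ari on the TFTP server (the name is just an example), the command is simply:

    apboot> upgrade os instant.ari

    Running osinfo again afterwards should now show a valid image in partition 0.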

    If it’s successful, you can simply type in boot and it’ll boot the newly flashed OS. Do not close the serial session yet. The first boot will take a while, as it needs to initialize a few things.

    Wait until you see something like this in the serial console:

     <<<<<       Welcome to the Access Point     >>>>>

    You can navigate to the IP address of the AP in a browser and once you’re at the login page, you can try logging in with the admin/admin combo. It likely won’t work. You then need to go back to the serial console and press enter to get a login prompt, and try the same credentials there. That should work, and once you’re in, you can try again on the web interface.

    Similarly, if you attempt to log in on the serial console before opening the web interface, it won’t let you in, so you need to try the website at least once before the CLI.

    Then, once you’re in, you’ll be prompted for the country code. This can be changed later using the serial CLI, so it’s not permanent like on Xirrus or Cambium APs. Though even on those it can be changed, it’s just a little more complicated.

  • The build of a Linux-based music streamer

    I like music. But every time I want to listen to something, it feels like too much of a hassle to fire up a computer, and I often think: nah, it isn’t worth turning on a PC for 10 minutes. So I decided to have a peek at what sort of streaming boxes were available and got quite discouraged when I realized that what I was looking for would likely cost something in the region of £500. So, naturally, I started looking for alternatives and decided to build something myself.

    Most of my music collection is on a NAS, from .mp3 to .dsd files. The latter are a bit peculiar to handle, so being able to play those wasn’t a requirement, but I did want to have the ability to play .flac files. And my girlfriend loves Spotify, so it had to be able to stream music from there too.

    The other requirement was a good quality DAC. This was one of the reasons why I didn’t just get a Chromecast Audio (apart from it having been discontinued a while ago) and was looking at the more Hi-Fi oriented products.

    So, after rummaging through some boxes of random computer bits, I found a low-power ECS mini PC and an Asus SupremeFX Hi-Fi USB DAC, which form the base of this project. The PC has a dual-core Atom CPU with 1GB of RAM and 4GB of eMMC storage, and the USB DAC is a really decent one that fits into a 5.25″ bay, but as my current PC case doesn’t have one, it was collecting dust. To power these, I purchased a 240V to 12V switch-mode PSU and a 12V to 5V DC-DC converter, as the DAC needs 12V and the PC runs on 5V.

    The main components

    The additional yellow part with the USB connector is a USB isolator, which galvanically separates the device from the USB host to reduce hissing and filter other noise out of the signal. I also ended up using a different DC-DC converter from the one shown here, as this one didn’t want to play well with the 12V PSU. The new one is from Traco Power, the same brand as the 240V power supply: the PSU is a TXM 035-112, capable of delivering 35W at 12V, and the DC-DC converter is a TEN 10-1211, which can output 10W at 5V for the PC.

    The new DC-DC converter

    Of course, it wouldn’t be complete without an enclosure, so I had to find something that looks nice and can be customized to fit all of these components. I stumbled upon these aluminium cases on Aliexpress that were made for audio amplifiers and I thought I’d give them a go. My choice was a T-2205, which comfortably fits all of the components in the above picture. I mounted everything using brass standoffs, so all I needed to do was to drill some holes in the bottom panel, which was quite simple, as the case arrived in pieces, much like Ikea furniture.

    The bottom plate
    The internal layout

    I added a wifi card to make positioning more flexible, and the RCA outputs mounted on the case are just soldered to 3.5mm audio jacks that plug into the sound card. One of those is a line input, which I haven’t utilized here. A rotary encoder was also installed on the front panel, as the sound card’s volume cannot be controlled in software. The DC-DC converter is not visible here, as it’s underneath the sound card.

    I found the aluminium quite easy to drill, and marking the positions of the holes was done with Sellotape and a sharpie. The front panel is 5mm thick and the hole I needed for the rotary encoder was around 9mm in diameter, so I used multiple drill bits, starting at 1mm and stepping up 0.5mm at a time. I also found that regular cooking oil makes a very good lubricant for drilling.

    A different angle

    The rotary encoder was soldered to the contacts of the original one on the sound card, and the cable was secured with some hot glue to keep it in place. The new rotary encoder is an Alps EC20A1820401, which has an extra protruding pin at the front to stop the body from turning when the knob is rotated. I drilled a 2mm hole for that, only around 2-3mm deep, so it didn’t go all the way through the panel.

    The hardware is mostly finished at this point, so the next stage is to install an OS and set up a music player app that can be controlled remotely.
