All posts by werner

A replacement for the JavaFx FontMetrics class for JDK9

Most people used an internal javafx FontMetrics class, which has been deprecated in version 9 of the jdk. That means that your app that relied on this will simply not work anymore. Below is a simple replacement that will provide the computeStringWidth as well as ascent, descent and lineHeight. The produiced values are exactly the same as if they were called from the FontMetrics class itselve.

import javafx.geometry.Bounds;
import javafx.scene.text.Font;
import javafx.scene.text.Text;
public class FontMetrics
{
 final private Text internal;
 public float ascent, descent, lineHeight;
 public FontMetrics(Font fnt)
 {
 internal =new Text();
 internal.setFont(fnt);
 Bounds b= internal.getLayoutBounds();
 lineHeight= (float) b.getHeight();
 ascent= (float) -b.getMinY();
 descent=(float) b.getMaxY();
 }

 public float computeStringWidth(String txt)
 {
 internal.setText(txt);
 return (float) internal.getLayoutBounds().getWidth();
 }
}

Adadelta lovely optimizer

I just tested adadelta on the superresolution example that is part of the pytorch examples. The results are quiet nice and I like the fact that LeCun’s intuition to use the hessian estimate actually got implemented in an optimizer (I tried doing it myself but couldn’t get through the notation in the original paper).

Interesting is that the ‘learning-rate of 1’ will scatter throughout the entire space a bit more than what you would expect. Eventually it does not reach the same minima as a learning rate of 0.1.

In the above example we also delineated each epoch every 20 steps. That means, when step%20==0 we cleared all the gradients. Which feelsd a bit odd that we have to do so. In any case, without the delineation into epochs the results are not that good. I do not entirely understand why. It is clear that each epoch allows the optimizer to explore a ‘new’ direction by forgetting the garbage trail it was on, and in a certain way it regularizes how far each epoch can walk away from its original position. Yet _why_ the optimizer does not decide for itself that it might be time to ditch the gradients is something I find interesting.

Orthogonal weight initialization in PyTorch seems kinda weird

I recently gave deep learning another go. This time I looked into pytorch. At least the thing lets you program in a synchronous fashion. One of the examples however did not work as expected.

I was looking into the superresolution example (https://github.com/pytorch/examples) and printed out the weights of the second convolution layer. It turned out these were ‘kinda weird’ (similar to attached picture). So I looked into them and found that the orthogonal weight initialization that was used would not initialize a large section of the weights of a 4 dimensional matrix. Yes, I know that the documentation stated that ‘dimensions beyond 2’ are flattened. Does not mean though that the values of a large portion of the matrix should be empty.

weights_grmpf

The orthogonal initialisation seems to have become a standard (for good reason. See the paper https://arxiv.org/pdf/1312.6120.pdf), yet is one that does not work together well with convolution layers, where a simple input->output matrix is not stratight away available. Better is to use the xavier_uniform initialisation. That is, in the file model.py you should have an initialize_weights as follows:

 def _initialize_weights(self):
   init.xavier_uniform(self.conv1.weight, init.calculate_gain('relu'))
   init.xavier_uniform(self.conv2.weight, init.calculate_gain('relu'))
   init.xavier_uniform(self.conv3.weight, init.calculate_gain('relu'))
   init.xavier_uniform(self.conv4.weight)

With this, I trained a model on the BSDS300 dataset (for 256 epochs) and then tried to upsample a small  image by a factor 2. The upper image is the small image (upsampled using a bicubic filter). The bottom one is the small picture upsampled using the neural net.

The weights we now get at least use the full matrix.

The output when initialized with “orthogonal” weights has some sharp ugly edges:

A better crossfade

A demonstration on the difference between two crossfades. The first is a straightforward crossfade. The second one is a crossfade in which the partials of both tracks are detected, selected (the strongest wins) and then resynthesized.

The normal crossfade:

Crossfading the partials:

To hear the difference listen to the middle of the two tracks (around 16″). While the normal crossfade sounds muddier, the second one retains the same volume and clarity as either track (at least to my ears).

A talk on timestretching at a hackerscamp

Time stretching of audio tracks can be easily done by either interpolating missing samples (slowing down the track), or by throwing away samples (speeding up the track). A drawback is that this results in a pitch change. In order to overcome these issues, we created a time stretcher that would not alter the pitch when the playback speed changed. In this talk we discuss how we created a fast, high quality time stretcher, which is now an integral part of BpmDj. We explain how a sinusoidal model is extracted from the input track, its envelope modeled and then used to synthesize a new audio track. The synthesis timestretches the envelope of all participating sines, yet retains the original pitch. The resulting time stretcher uses only a frame overlap of 4, which reduces the amount of memory access and computation compared to other techniques.

We assume the listener will have a notion about Fourier analysis. We do however approach the topic equally from an educational as well as from a research perspective.

High resolution slides are available at http://werner.yellowcouch.org/Papers/sha2017/index.html

 

JavaFx 8 command line options

I have been struggling with an “interesting” javafx render problem. In BpmDj, given enough updates, the rendertree would partially stop updating. It was clear this was a concurrency bug, yet one that was not my fault. As a preliminary solution to fix this I would grab a screenshot everytime I would add a node. Recently however that solution no longer worked and the dirty logic would be so fucked up that elements which should be visible were not visible. It was even possible to have a font change its color halfway through its rendering.

So I started to look around for javafx options, and there are a few of them that proved to be useful.

What Flag Default
VSync prism.vsync true
Dirty region optimizations prism.dirtyopts true
Occlusion Culling prism.occlusion.culling true
dirtyRegionCount prism.dirtyregioncount 15
Scrolling cache optimization prism.scrollcacheopt false
Dirty region optimizations prism.threadcheck false
Draws overlay rectangles showing where the dirty regions were prism.showdirty false
Draws overlay rectangles showing not only the dirty regions, but how many times each area within that dirty region was drawn (covered by bounds of a drawn object). prism.showoverdraw false
Prints out the render graph, annotated with dirty opts information prism.printrendergraph false
Force scene repaint on every frame prism.forcerepaint false
disable fallback to another toolkit if prism couldn’t be init-ed prism.noFallback false
Shape caching optimizations prism.cacheshapes complex
New javafx-iio image loader prism.newiio true
Verbose output prism.verbose false
Prism statistics print frequency, <=0 means “do not print” prism.printStats 0
Debug output prism.debug false
Trace output prism.trace false
Print texture allocation data prism.printallocs” false
Disable bad driver check warning prism.disableBadDriverWarning” false
Force GPU, if GPU is PS 3 capable, disable GPU qualification check. prism.forceGPU false
Skip mesh normal computation prism.experimental.skipMeshNormalComputation false
Which driver to use prism.order
prism.forcepowerof2 false
prism.noclamptozero false
Try -Dprism.maxvram=[kKmMgG] prism.allowhidpi true
prism.maxvram 512 * 1024 * 1024
Try -Dprism.targetvram=[kKmMgG]|<double(0,100)>% prism.targetvram
prism.poolstats false
prism.pooldebug false
prism.maxTextureSize
prism.minrttsize
prism.disableRegionCaching
prism.disableD3D9Ex false
prism.disableEffects false
prism.glyphCacheWidth 1024
prism.glyphCacheHeight 1024
Enable the performance logger, print on exit, print on first paint etc. sun.perflog
sun.perflog.fx.exitflush
sun.perflog.fx.firstpaintflush
sun.perflog.fx.firstpaintexit
prism.supershader true
Force uploading painter (e.g., to avoid Linux live-resize jittering) prism.forceUploadingPainter false
Force the use of fragment shader that does alpha testing (i.e. discard if alpha == 0.0) prism.forceAlphaTestShader false
Force non anti-aliasing (not smooth) shape rendering prism.forceNonAntialiasedShape false
Set Single GUI Threading quantum.singlethreaded false
Print quantum verbose quantum.verbose false
JavaFx framerate in FPS javafx.animation.pulse 60

FX thread collision with Render thread ?

When running with the prism.threadcheck option I got the following error:

ERROR: PrismPen / FX threads co-running: DIRTY: false
FX: java.lang.Thread.getStackTrace(Thread.java:1559)
FX: com.sun.javafx.tk.quantum.QuantumRenderer.checkRendererIdle(QuantumRenderer.java:247)
FX: com.sun.javafx.tk.quantum.QuantumToolkit.checkFxUserThread(QuantumToolkit.java:424)
FX: javafx.scene.Scene$MouseHandler.process(Scene.java:3680)
FX: javafx.scene.Scene$MouseHandler.access$1500(Scene.java:3485)
FX: javafx.scene.Scene$MouseHandler$1.run(Scene.java:3521)
FX: com.sun.javafx.application.PlatformImpl.lambda$null$173(PlatformImpl.java:295)
FX: java.security.AccessController.doPrivileged(Native Method)
FX: com.sun.javafx.application.PlatformImpl.lambda$runLater$174(PlatformImpl.java:294)
FX: com.sun.glass.ui.InvokeLaterDispatcher$Future.run(InvokeLaterDispatcher.java:95)
FX: com.sun.glass.ui.gtk.GtkApplication._runLoop(Native Method)
FX: com.sun.glass.ui.gtk.GtkApplication.lambda$null$49(GtkApplication.java:139)
FX: java.lang.Thread.run(Thread.java:748)
QR: com.sun.javafx.sg.prism.NGCanvas.getStroke(NGCanvas.java:777)
QR: com.sun.javafx.sg.prism.NGCanvas.setupStroke(NGCanvas.java:785)
QR: com.sun.javafx.sg.prism.NGCanvas.handleRenderOp(NGCanvas.java:1212)
QR: com.sun.javafx.sg.prism.NGCanvas.renderStream(NGCanvas.java:1097)
QR: com.sun.javafx.sg.prism.NGCanvas.renderContent(NGCanvas.java:606)
QR: com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2053)
QR: com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1945)
QR: com.sun.javafx.sg.prism.NGGroup.renderContent(NGGroup.java:235)
QR: com.sun.javafx.sg.prism.NGRegion.renderContent(NGRegion.java:576)
QR: com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2053)
QR: com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1945)
QR: com.sun.javafx.sg.prism.NGGroup.renderContent(NGGroup.java:235)
QR: com.sun.javafx.sg.prism.NGRegion.renderContent(NGRegion.java:576)
QR: com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2053)
QR: com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1945)
QR: com.sun.javafx.sg.prism.NGGroup.renderContent(NGGroup.java:235)
QR: com.sun.javafx.sg.prism.NGRegion.renderContent(NGRegion.java:576)
QR: com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2053)
QR: com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1945)
QR: com.sun.javafx.sg.prism.NGGroup.renderContent(NGGroup.java:235)
QR: com.sun.javafx.sg.prism.NGRegion.renderContent(NGRegion.java:576)
QR: com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2053)
QR: com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1945)
QR: com.sun.javafx.sg.prism.NGGroup.renderContent(NGGroup.java:235)
QR: com.sun.javafx.sg.prism.NGRegion.renderContent(NGRegion.java:576)
QR: com.sun.javafx.sg.prism.NGNode.doRender(NGNode.java:2053)
QR: com.sun.javafx.sg.prism.NGNode.render(NGNode.java:1945)
QR: com.sun.javafx.tk.quantum.ViewPainter.doPaint(ViewPainter.java:477)
QR: com.sun.javafx.tk.quantum.ViewPainter.paintImpl(ViewPainter.java:330)
QR: com.sun.javafx.tk.quantum.PresentingPainter.run(PresentingPainter.java:91)
QR: java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
QR: java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
QR: com.sun.javafx.tk.RenderJob.run(RenderJob.java:58)
QR: java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
QR: java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
QR: com.sun.javafx.tk.quantum.QuantumRenderer$PipelineRunnable.run(QuantumRenderer.java:125)
QR: java.lang.Thread.run(Thread.java:748)

Of course how that related to my problem wasn’t clear, allthough maybe it could explain why the dirty flags were all ****ed up.

Overdraw and dirty draws

overdraw

The option -Dprism.showoverdraw=true is interesting because it shows us which area we drew once (red), which twice (green), and so on. This showed me that the areas that didn’t update were not even marked as ‘dirty’. Suggesting that the problem was indeed with the setting and tracking of the dirty flag in the renderer. Therefore we tried the option -Dprism.dirtyopts=false and hooray all problems disappeared. Of course the app ran slightly slower, but at least it worked now.

Java ReentrantReadWriteLock oddities

When I was hit with startvation of some of the update threads in BpmDj, I was a bit puzzled. After all, I did use a ReentrantReadWriteLock in fair mode. A simple profiling showed that certain transaction where substantially more heavy (a writelock taking lets’ say 10 seconds), while the databasereads would merely take 1 second.

From that I concluded that because the writelock was held longer, other threads did not have the opportunity to have a fair amount of locktime themselves. E.g: the write lock is released, the longest waiting readlock is granted: that transaction is done within a second, and the writelock is granted again. And is not released for another 10 seconds.

To test this I set up a program to create 10 reader threads and 1 writer thread. Each thread would acquire a lock, wait some time (to simulate the ‘work’ done in the locked section) and then release the lock. This would be performed in a loop for about 10 seconds. Afterwards, we could measure how much time within the lock was spend for each thread and compare that with the amount of work the thread wanted to perform.

These were the results:

Unfair Lock

The unfair lock behaved, as expected, fairly unfair. If the writer had 10 times less work than the readerthreads, its locktime would be 8 times higher. If the writer had the same amount of work, it would have 16 times more locktime and if the writer had 10 times more work, then it would be granted 50 times more locktime.

Thus:
/10.0 => *7.944541604031417
1.0 => *15.917042652441687
*10.0 => *50.5366207048361

A fair lock

When we created the read/write lock in a fair fashion, the results were more in line with what we would expect:

/10.0 => /8.810361366979999
1.0 => /1.0021791947397343
*10.0 => *9.009623837637745

That is, when the worker has 10 times less work than the readers, it has 8 times less locktime. If it has the same amount of work, it receives the same amount of locktime and if it has 10 times more work, then it is granted 9 times more locktime.

This is completely as expected, yet not something we might want, because it allows heavy tasks to block the lighter tasks.

A fair lock, prefixed with tryLock()

tryLock allows an app to check for a lock, and if it is not granted the lock to continue with something else. There are two tryLock variants. The first without parameters (tryLock()), the second with a timeout.

Trylock() screws up any scheduling that might have been in place and just barges in on anything the algorithm might be planning to do.

/10.0 => /925.0224470413133
1.0 => /99.7994977890176
*10.0 => /11.036514077119893

In this scenario, the writer thread pretty much does not get anything done. Whether it is performing 10 times less or 10 times more work, its locktime ranges from ~ /1000 to /10. This is very bad, because you might expect the tryLock to make the lock unfair, yet the results of an unfair lock (see above) are completely the opposite.

A fair lock, prefixed with tryLock(timeout)

There is a second variant of tryLock: one with a timeout; which can indeed be 0. If we apply that, we get the following results:
/10.0 /8.949334569534225
1.0 /1.0091078687281538
10.0 *9.016406673440223

which is in line with the straightforward fair lock.

In BpmDj, we used the tryLock() instead of tryLock(0), assuming that a timeOut of 0 would result in the same behavior between them.

Sleep debt, Oxygen deprivation and Resmed

I apologize at the start of this post. I never wanted to sound as someone who wants to document his snoring. Anyway, I do so because I feel I have some important things to share. I will try to stick to some facts that might help you without going too much in my personal situation.

Since half a year or so I try to get my snoring under control. To that end I got a Resmed Airsense 10 from docters/insurers together with a nasal mask.

Nostrils

The first big obstacle was learning to sleep with my mouth closed. Not much to do but to actually do it. This took some weeks.

Then getting up to speed was problematic because my nostrils would be more closed than open during the night. That lead to painful lungs and a not so optimal ‘therapy’. Two things were necessary to resolve this

a- got rid of the airfilter in the machine. The airfilter that was installed would actually pollute the air coming in (it hadn’t been changed in at least 6 months and the provider didn’t feel in a hurry to change it)

b- started using the humidifier at position 4. Every morning I would take it out of the machine and leave it open during the day. That would allow the water to breath. Once in a while I would replace it completely.

With those two tricks I got my nostrils somewhat under control.

Lack of oxygen

Although the headaches during the days vanished immediately after starting to use the machine, I now got symptoms of someone lacking oxygen. I felt really tired in the afternoon. Talking to my doctor did not help very much. He suggested that I could not be lacking oxygen because there was a positive pressure at the inlet.

It took me a couple of months to realize that he was wrong. To understand that just imagine the mask with no breathing holes. If you exhale air you will fill up the long tube, the humidifier and the rest of the machine with used air. The next breath you take will be first the old air, then the new. Now imagine, only 1 or two holes. The machine will be able to generate the necessary pressure, still no real air exchange will take place.

To analyze this further I set up a simulation in which the patient would inhale all the air he just exhaled. Here is what happens to the oxygen level then:

reuse-same-air

The above plots shows the initial oxygen concentration at about 21%. Each breath removes 1/4th of the oxygen, leaving you after 4 breaths with only 1/3th of the necessary oxygen ! That is quiet staggering.

Of course, the resmed machines do not have closed holes. The positive pressure replaces some of the old air with new air.  The question is now: how much ? This can be expressed as a percentage of the air that is swapped’ per breath cycle. For each percentage, we can calculate the amount of oxygen (compared to normal air) that would be available to you during the night.

poison-yourself2 poison-yourself

From the above plot we can see that if the machine is able to swap out 80% of the air (during one breath-cycle),  you will have 93% of normal-air-oxygen. That is 7% less than you need.

Clearly we had found the culprit. I must have had an air exchange percentage that was sufficiently low, leading to a low oxygen availability. The question now was: what to do about it ?

Solution #1: turn of the ‘autoset’ feature of the resmed machine or increase its minimum pressure substantially – Initially, my machine was set in autoset mode, which means that it would try to determine  the best setting for your situation automatically. It will navigate itself between the minimum and maximum boundaries to minimize the number of blockages. Of course, that minimum might lower the average pressure that sends air out of the mask. Thus: it might be that, although it might minimize the apnoes, it no longer vents properly. An easy solution to solve that is to use a continuous positive pressure, or to increase the minimum pressure. That there is some truth to this can be read from online reports of people who went from a normal CPAP mask (resmed 8) to an automatic settings mask (resmed 10) and complained that they felt worse with this new machine.

How does the mathematics look ? Generally, air pressure through a hole can be modeled as the square root of the pressure difference divided by the air density. (ignoring friction and so on). We thus have sqrt(0.016666 P) describing the air speed velocity through the holes. Thus if we raise the minimum pressure by a factor x, we will push out around sqrt(x) more air.

Solution #2: turn off  Exhalation Pressure Release (The EPR setting) – online you often read that people sleep easier when the machine allows you to exhale easier (it does that by lowering the pressure when you exhale), yet in doing so, it drastically reduce the amount of air swapped out.

Solution #3: use a larger mask. Try to close the main hole of the mask and exhale through the little holes. Measure how long it takes to exhale and compare that against a normal exhalation. If you cannot generate enough pressure to get all the air out within one breath cycle, then your mask is too small. This problem can be solved by using a larger mask

And then it happened. I ditched Firefox

The latest installment of Firefox forces linux users to use Pulseaudio. Pulseaudio is an excuse for a sound pipeline.  It does not work. It is as simple as that.

BpmDj using pulseaudio skips through the audiostream because the java audiodriver apparently cannot keep track of the pulseaudio stream. You would say: yes it it probably a java problem. Not so, many applications, also non java apps, have this problem.

https://aliver.wordpress.com/2016/02/17/why-i-dislike-pulseaudio/ has some more ranting on the state of pulseaudio. A must read.

Aside from that, pulseaudio with firefox is a complete CPU hog. Doing nothing in firefox: 47% CPU usage for the Webcontent process. Throw pulseaudio off your system: a mere 9% CPU usage (which I already find too much but you can see how pulseaudio affects my fan life).

Then, there is the problem that the mozilla developers ignore the pleas of their users wrt to this ‘feature/bug’ https://bugzilla.mozilla.org/show_bug.cgi?id=1247056 is a bloody painful read. None of the developers make any sense. It seems like they have all been lobotomized. Their arguments boil down to: ‘yeah, it was easier to program. Fuck you. And BTW, fuck you again. Please, send us your telemetry if you need more help.’ They won’t be getting much telemetry from me anymore. I ditched firefox and switched to the iridium-browser. And my goodness that thing works fast and good.