All posts by werner

Deploying a .NET application

BpmDj… my brainchild… bringing in no money… So I am looking for free tools.

ClickOnce

ClickOnce is a technolog by microsoft to easily start and upgrade an application. It is remarkably similar to the packaging solution I created in Java. At start time it will check what is available and download it if the user wants so. I like this.

ClickOnce of course requires ‘code signing’ certificates, which are really difficult to make. And without shelling out money, windows  smartscreen will always copmplain when the application is installed or upgraded. Yet… I will not spend 100EUR/Year just to remove that dialog.

A solution would be to use no installer, and then secretly upgrade the application behind the users back. Nevertheless, even then I will get the ‘untrusted application’ message, so I will assume BpmDj users are smart (they are) and will probably realize that it is pointless to spent so much money on something they will accept anyway.

.NET Obfuscators

BpmDj… my brainchild… bringing in no money… So I am looking for free tools.

In order to figure out how much information assemblies throw around,  just have a look at https://www.jetbrains.com/decompiler/ It basically returns me the original source code, including all variable names and everything else that could have been thrown out. Therefore, an obfuscator is really necessary. http://www.dotnetstuffs.com/best-free-obfuscator-to-protect-net-assemblies/ had a list of interesting possibilities.

Dotfuscator (A Lead to Sell)

The microsoft site refers to ‘DotFuscator’; and let me tell you.. the community edition is bullshit. The entire thing is one big lead to sell you their product. It starts with a forced registration (you have to give a valid emailaddress). Then when you are in the application, you only see advertisment, not a lot of real useful obfuscation going on. And lastly, when I ran it on BpmDj it wasn’t even able to go through it because ofg ‘mixed assemblies’ I am sure I could set up a joint project with preemtive solutions, in which I would of course pay them, but honestly… don’t bother with this bullshit. The community edition doesn’t do what it pretends it will do.

Obfuscar (No XAML)

Obfuscar tries to map as many input names to as few output names as possible. ‘Massive overloading’ as they call it. https://code.google.com/archive/p/obfuscar/

At first glance this seems a dead end.. Last release was 11 years ago. However stack overflow posts still discuss it in 2018. Ha no, it seems to have moved to https://github.com/obfuscar/obfuscar

Amazingly enough, after getting a simply configuration it actually ran through the entrire shebang of assemblies and generated 1 output. That output could even start ! Yet it did hang at the splashscreen. Attaching a debugger showed that all threads had been started properly, so I assume either I access a dynamic resource by name (I do have some explicit invokes laying around), or the XAML bindings were seriously fucked up. This is something I should test somewhat further, because if this works we are done.

Oh the horror. The default configuration doesn’t actually obfuscate shit. All identifiers were still present, despite the fact that it claims it had a ‘mapping’. Probably it kept all public properties public as they were without renaming them.

A list of options for the Obfuscar xml file

InPath = Environment.ExpandEnvironmentVariables(vars.GetValue("InPath", "."));
 OutPath = Environment.ExpandEnvironmentVariables(vars.GetValue("OutPath", "."));
 LogFilePath = Environment.ExpandEnvironmentVariables(vars.GetValue("LogFile", ""));
 MarkedOnly = XmlConvert.ToBoolean(vars.GetValue("MarkedOnly", "false"));

RenameFields = XmlConvert.ToBoolean(vars.GetValue("RenameFields", "true"));
 RenameProperties = XmlConvert.ToBoolean(vars.GetValue("RenameProperties", "true"));
 RenameEvents = XmlConvert.ToBoolean(vars.GetValue("RenameEvents", "true"));
 KeepPublicApi = XmlConvert.ToBoolean(vars.GetValue("KeepPublicApi", "true"));
 HidePrivateApi = XmlConvert.ToBoolean(vars.GetValue("HidePrivateApi", "true"));
 ReuseNames = XmlConvert.ToBoolean(vars.GetValue("ReuseNames", "true"));
 UseUnicodeNames = XmlConvert.ToBoolean(vars.GetValue("UseUnicodeNames", "false"));
 UseKoreanNames = XmlConvert.ToBoolean(vars.GetValue("UseKoreanNames", "false"));
 HideStrings = XmlConvert.ToBoolean(vars.GetValue("HideStrings", "true"));
 Optimize = XmlConvert.ToBoolean(vars.GetValue("OptimizeMethods", "true"));
 SuppressIldasm = XmlConvert.ToBoolean(vars.GetValue("SuppressIldasm", "true"));

XmlMapping = XmlConvert.ToBoolean(vars.GetValue("XmlMapping", "false"));
 RegenerateDebugInfo = XmlConvert.ToBoolean(vars.GetValue("RegenerateDebugInfo", "false"));

At least when setting the KeepPublicApi to false, some obfuscation happened.

https://stackoverflow.com/questions/28031859/rename-an-internal-class-with-obfuscar has a nice example of a config file. Now, from what I see, the BAML is used to quickly figure out which elements of the UI refer to classes by name. That is obviously smart, yet I would like to rename the XAML entries as well.

I also figured out that the BAML/XAML tree is still stored in the assembly as it was present in the original solution, so no reordering takes places in any way. Not a total failure, but not great either because of this.

ConfuserEx (Abandonded)

Finally something that is not a landing page. Huray ! Last update… 26 January 2019.. still it might work and it is open source. The projecct has indeed be discontinued since 1. July 2016.

In any case, a run of it did behave similarly as Obfuscar. The application started and didn’t get further,probably because of the missing DLL’s. I might need to fix that problem if no obfuscator gets through it. In any case, the XAML was effectively gone after obfuscating; or the dotPeek decompiler stopped trying. I am not entirely sure what it is yet.

After spending some hours on this problem, the problem seems to be in the renaming strategy used. I am not yet sure whether I will blame confuserex or my own program, given that Obfuscar had exactly the same error as this one. Then again maybe they both are based on the same source, so they might both be suffering from the same bug.

In any case, performing a ‘none’ protection did not damage the original assembly, which is already a good sign. Also nice was that there was a debug protection in place which caused the applciation to bark when a debugger tries to connect.

Skater Light (Does not  obfuscate at all)

Also Skater is a piece of software to buy. They do have a free version, named SkaterLight. Oddly enough… this feels a lot like chinese spyware. Seriously.

  1. it worked at the first attempt. So I was a bit skeptical… I decompiled the generated assembly and lo and behold the thing was just not obfuscated whatsoever.
  2. after installing it, it actually ran with elevated privileges (I know that because I could not read the generated assembly)
Yes… ‘Full Obfuscation’
Just don’t believe it

Dead Ends

  • Eazfuscator (not free) – Next is Eazfuscator because they seem very eager to actually deal with the WPF/XAML issue. Oh well.. forget it. Not free anymore. This is the point where I considered whether it would be possible to use a decompile tool to decompile an obfuscator, remove the licensing restrictions and continue. There is a certain beauty to this approach: if the obfuscator sucks, then we can easily do that, which makes it pointless to actually use then
  • CodeFort (disappeared) – was mentioned as another option which works well with XAML. Yet, latest udpate on the twitter feed was 2010 and the domain itself became a lnading page.
  • Agile.NET (non free)
  • FXProtect (disappeared)
  • ILProtector (not free) – has gone commercial since version 2.0.17
  • Babel.NET (not free) – https://www.babelfor.net/downloads/
  • SharpObfuscator (abandonded 2007) – was at https://archive.codeplex.com/?p=sharpobfuscator 
  • Goliath.NET (not free)
  • .NET Reactor (not free) Nice page
  • .NETGuard (not free)
  • Smart Assembly (excessive pricing)
  • CodeVeil – encrypts the DLL before executing it. In the end this might be a better option than ‘obfuscating’ it. Drawback is of course that we have a single point of failure. Another drawback is that it is a chinese product and only a trial version.
  • CryptoObfuscator (not free)
  • Rummage (fair pricing)
  • Xenocode (abandonded)
  • DeepSea Obfuscator
  • MaxtoCode
  • MPRESS
  • Spices.Net

The link https://github.com/de4dot/de4dot/tree/master/de4dot.code/deobfuscators is interesting because it lists existing obfuscators 1) that can be deobfuscated 2) were sufficiently know to bother deobfuscating 3) how difficult each one is

A new object cache for BpmDj

In BpmDj we load objects on demand: every time a particular object is accessed we load it from the database. This process happens automatically, and is implemented through a dictionary which maps an object id to a runtime representation.

In Java, this dictionary was a WeakDictionary, which is a dictionary from which values can be removed by the garbage collector. When when they got removed and the program accessed that object again, we would load it fresh from the database. This poor man caching is not particularly good because any garbage collect will remove all loaded (but unreferenced) objects, forcing the program to reload those object again. Even if the particular object is often used.

To solve that, we could force references to stay in memory by means of a round robin queue. Every time an object is accesed it is put in the next position in the buffer. As such, we ensure that the cache keeps X instances alive.

Sadly that strategy is unable to deal with a burst of requests. Any often used object will simply be pushed out of the buffer when a batch of new objects is loaded (like for instance when the song selector opens).

To alleviate this problem, we can, with each access, gradually increase the stickiness of a cache item. This idea turned out to be fairly efficient:

  • every entry has a position in the buffer. Whenever the entry is hit, it moves to half its original position.
  • every new element is placed in the middle of the buffer.

This strategy leads to a distribution where often used elements are in front of the buffer. Lesser used elements slowly walk their way out of the buffer until they are evicted. To avoid that items become too sticky (e.g: there can be items that have just enough been accessed to never leave the buffer again), it is useful to add a random element to this

  • reposition an element to a random position between 0 and ratio * originalRank.

One could argue that having too many object id’s and too few actual objects would be a cause of concern, and it clearly is. Nevertheless, there often is a space tradeoff between holding on to an object and using its id.

The image shows the buffer of a cache of capacity 100, with 800 distinct element randomly accessed. The access pattern was shaped according to a power law distribution. The front of the cache are those that are more sticky than the later part of the buffer. The height of each entry indicates its priority in the emitter.

The following picture shows the difference between 3 types of cache. The first is the roundrobin mentioned earlier, the second is a cache which keeps backreferences and the elevator cache is the one implemented here.

The data on which this was ran was the retrieveal of all startup objects BpmDj need, including the opening of the song selector. The total object count was 133632, of which 70291 unique ones.

XUnit vs MSTest

After having tested both of them extensively I can draw the following conclusion: MSTest is definitely the winner. Why ?

XUnit

  • buggy as hell. For a testframework this is kinda weird
  • very very slow
  • really confused about the tests that are available
  • No standard output. Yes I know you can redirect it, still they should not steal my debug output in the first place.
  • Different assertions than MSTest, and they are badly implemented at that (E.g: an assertion finding the content of a collection will simple iterate over all elements. It is truly painful to see how far computer scientists have sunken)
  • crashes VS2019 when in auto-hide
  • Talks about [Theories] and [Facts] instead of [TestMethod], just some ‘cool’ jargon and indeed far removed from reality.

MSTest

Does not have the same level of ‘we are so cool but can’t program’ fuckery as Xunit

Allthough this post is small, nobody seems to care to say how bad xUnit exactly is.

Scoping rules in WPF

Staticresources are resolved at parsing time

A staticresource is read when the xaml is read. This is demonstrated in the following example

<Style TargetType="{x:Type TextBlock}">
 <Setter Property="Foreground" Value="{StaticResource NormalTextColor}"/>
 </Style>

<SolidColorBrush x:Key="NormalTextColor" Color="Yellow"></SolidColorBrush>

At the moment the style is created, NormalTextColor is not defined yet. And so it stays whenever later that style is applied. If we swap the NormalTextColor definition and the Style then it will be a fixed yellow.

Dynamicresources are resolved whenever necessary

If we modify the StaticResource in a DynamicResource, then that example will behave correctly, and every textblock will have a yellow foreground.

Resource Lookup

The resource lookup goes from child to parent.

<Window x:Class="WinWpfTests.MainWindow"
 Title="MainWindow" Height="450" Width="800">
 <Window.Resources>
   <SolidColorBrush x:Key="NormalTextColor" Color="Orange"></SolidColorBrush>
 </Window.Resources>
 <StackPanel Grid.Row="1" Orientation="Horizontal">
   <StackPanel.Resources>
     <SolidColorBrush x:Key="NormalTextColor" Color="Red"></SolidColorBrush> 
   </StackPanel.Resources>
 <TextBlock Text="Some text"/>
 </StackPanel>
</Window>

This means that any textblock within the stackpanel will be colored red, while the textblocks outside the stackpanel yet inside the window will be orange. And ifg the application.xaml is defined as in our first example, then any other window will be yellow.

It might be necessary to restyle multiple controls

Whenever a textblock is used it will have the provided style. A label however has its own foreground color defined, and so requires an extra style.

<Style TargetType="{x:Type TextBlock}">
 <Setter Property="Foreground" Value="{DynamicResource NormalTextColor}"/>
 </Style>
<Style TargetType="{x:Type Label}">
 <Setter Property="Foreground" Value="{DynamicResource NormalTextColor}"/>
 </Style>

Resource lookup through templates

ControlTemplates provide a way to render a particular element differently.  The  dynamic lookup still goes from the lexical point of insertion up the logical tree. Thus the following fragment

 <Style x:Key="ModifiedLabel" TargetType="{x:Type Label}">
 <Setter Property="Template">
 <Setter.Value>
 <ControlTemplate TargetType="{x:Type Label}">
 <StackPanel Orientation="Vertical">
 <TextBlock Text="Zhe legend"/>
 <ContentPresenter/>
 </StackPanel>
 </ControlTemplate>
 </Setter.Value>
 </Setter>
 </Style>

Will render both the ‘Zhe legend’ as well as the actual content of the label using the same dynamicresource: that is they will both have the same color, even if the controltemplate was defined in a different file. (One could expect that ‘Zhe Legend’ would follow a lookup hierarchy going from the definition of the template, while the ContentPresenter would follow a different hierarchy)

The logical parent with controltempaltes

The logical parent of the contentpresenter is the controltemplate, which is the same as the control being templated. Thus if we set the template of a label to something, and then define a resources in the controltemplate (as ControlTempalte.Resources), then these resources are part of the label, and thus are visibly to dynamicresources applied to the contentpresenter.

Yet, if we place the resources to a subelement within the controltemplate, then they are not part of the label, and thus not part of the logical chain of parents from the contentpresenter.

Under the assumption that the default color has been set to red in the App.xaml, we have two ways to define a controltemplate, with two different results

 

Who has priority ?

 

Because both the ControlTemplate and the original instantiation both access the same resource dictionary frame, it is useful to figure out who has priority. The answer is: the controltempalte its resources are applied first, afterwards those defined in the actual instantiation of the control.

Shortest Distance from a Point to a Line segment

A straight translation from python to java; taken from  http://www.fundza.com/vectors/point2line/index.html

public double distance2(Point2D pnt)
  {
    double lineVecDx=to.x-from.x;
    double lineVecDy=to.y-from.y;
    double pntVecDx=pnt.getX()-from.x;
    double pntVecDy=pnt.getY()-from.y;
    double lineLen=dist(lineVecDx,lineVecDy);
    double pntVecLength=dist(lineVecDx,lineVecDy);
    double lineUnitvecDx=lineVecDx/pntVecLength;
    double lineUnitvecDy=lineVecDy/pntVecLength;
    double pntVecScaledDx=pntVecDx/lineLen;
    double pntVecScaledDy=pntVecDy/lineLen;
    double tx = lineUnitvecDx * pntVecScaledDx;
    double ty = lineUnitvecDy * pntVecScaledDy;
    double t = tx+ty;
    if (t<0) t=0;
    else if (t>1) t=1;
    double nearestX=lineVecDx*t;
    double nearestY=lineVecDy*t;
    return dist(nearestX-pntVecDx, nearestY-pntVecDy);
  }

double dist(double dx, double dy)
  {
    return Math.sqrt(dx*dx+dy*dy);
  }

A replacement for the JavaFx FontMetrics class for JDK9

Most people used an internal javafx FontMetrics class, which has been deprecated in version 9 of the jdk. That means that your app that relied on this will simply not work anymore. Below is a simple replacement that will provide the computeStringWidth as well as ascent, descent and lineHeight. The produiced values are exactly the same as if they were called from the FontMetrics class itselve.

import javafx.geometry.Bounds;
import javafx.scene.text.Font;
import javafx.scene.text.Text;
public class FontMetrics
{
 final private Text internal;
 public float ascent, descent, lineHeight;
 public FontMetrics(Font fnt)
 {
 internal =new Text();
 internal.setFont(fnt);
 Bounds b= internal.getLayoutBounds();
 lineHeight= (float) b.getHeight();
 ascent= (float) -b.getMinY();
 descent=(float) b.getMaxY();
 }

 public float computeStringWidth(String txt)
 {
 internal.setText(txt);
 return (float) internal.getLayoutBounds().getWidth();
 }
}

Adadelta lovely optimizer

I just tested adadelta on the superresolution example that is part of the pytorch examples. The results are quiet nice and I like the fact that LeCun’s intuition to use the hessian estimate actually got implemented in an optimizer (I tried doing it myself but couldn’t get through the notation in the original paper).

Interesting is that the ‘learning-rate of 1’ will scatter throughout the entire space a bit more than what you would expect. Eventually it does not reach the same minima as a learning rate of 0.1.

In the above example we also delineated each epoch every 20 steps. That means, when step%20==0 we cleared all the gradients. Which feelsd a bit odd that we have to do so. In any case, without the delineation into epochs the results are not that good. I do not entirely understand why. It is clear that each epoch allows the optimizer to explore a ‘new’ direction by forgetting the garbage trail it was on, and in a certain way it regularizes how far each epoch can walk away from its original position. Yet _why_ the optimizer does not decide for itself that it might be time to ditch the gradients is something I find interesting.

Orthogonal weight initialization in PyTorch seems kinda weird

I recently gave deep learning another go. This time I looked into pytorch. At least the thing lets you program in a synchronous fashion. One of the examples however did not work as expected.

I was looking into the superresolution example (https://github.com/pytorch/examples) and printed out the weights of the second convolution layer. It turned out these were ‘kinda weird’ (similar to attached picture). So I looked into them and found that the orthogonal weight initialization that was used would not initialize a large section of the weights of a 4 dimensional matrix. Yes, I know that the documentation stated that ‘dimensions beyond 2’ are flattened. Does not mean though that the values of a large portion of the matrix should be empty.

weights_grmpf

The orthogonal initialisation seems to have become a standard (for good reason. See the paper https://arxiv.org/pdf/1312.6120.pdf), yet is one that does not work together well with convolution layers, where a simple input->output matrix is not stratight away available. Better is to use the xavier_uniform initialisation. That is, in the file model.py you should have an initialize_weights as follows:

 def _initialize_weights(self):
   init.xavier_uniform(self.conv1.weight, init.calculate_gain('relu'))
   init.xavier_uniform(self.conv2.weight, init.calculate_gain('relu'))
   init.xavier_uniform(self.conv3.weight, init.calculate_gain('relu'))
   init.xavier_uniform(self.conv4.weight)

With this, I trained a model on the BSDS300 dataset (for 256 epochs) and then tried to upsample a small  image by a factor 2. The upper image is the small image (upsampled using a bicubic filter). The bottom one is the small picture upsampled using the neural net.

The weights we now get at least use the full matrix.

The output when initialized with “orthogonal” weights has some sharp ugly edges: