Mottek Archive 2011

Debugging and the Scientific Method

Once a program reaches a stable state in terms of architecture, design and functionality, debugging becomes the predominant activity in the development process (unless the architecture or design was flawed to begin with). The largest part of debugging is finding the root cause of an error, whereas the actual fix is usually much simpler: many bugs are fixed with one-line or even one-character changes.

Verifying Assumptions

My approach to understanding why a program fails is to identify a set of hypotheses and then test them one after another. There are a couple of important points in this process. Testing can be done via various mechanisms, such as debugging, logging, modifying input data, modifying program code, modifying timings, or changing the execution environment. Each test should produce unambiguous results, and the hypotheses should be verified sequentially so that those results can be interpreted. The hypotheses need to be simple and non-overlapping so that they reliably reduce the set of possible root causes; trying to verify too much at the same time can be quite difficult.

Reducing the Debugging Work

The actual verification process (implementing and running the tests) is usually simple but time-consuming, so I try to automate as much as possible. But the real way to reduce debugging work is to identify the right set of hypotheses, and this is unfortunately also the toughest part of debugging. It is usually a mix of domain knowledge, intuition and a systematic approach that pays off. Domain knowledge can unfortunately only be gained by actually working with the program code; it is very rare that a problem is generic enough to be solved without a deeper understanding of the code. Intuition, however, is a different component altogether: it allows making assumptions without the deepest possible knowledge. Deliberate ignorance helps to avoid digging in too deep (after all, understanding a system down to the metal takes a long time) and to produce hypotheses faster, e.g. based on previous experience. There is of course the risk that such intuition-based hypotheses lead nowhere, and I have had many situations where I declared victory too early. Humility is therefore a very important partner to intuition!

Why Debuggers are Bad

Formulating a hypothesis about why a program is broken requires thinking and knowledge. Using a debugger usually increases the knowledge about the inner workings of a program, but it can easily stand in the way of the thinking part when coming up with ideas about error root causes. Debuggers are good for verifying mini-hypotheses when stepping through code, but they are very cumbersome for getting to the big picture and testing more complex problems (plus they are worthless for testing timing issues). My impression is that they are a popular tool because they tend to give immediate results on these mini-hypotheses, which gives the illusion of making progress. For me, it is far more important to read code, work with a peer, and use tracing and logging to understand the problem.

Deliberate Breakage

One of my favourite approaches to producing hypotheses is to take the program in its two states, working and broken, and reduce the difference in implementation between them until the smallest difference is identified as the root cause. In cases where a bug is the result of changes due to ongoing implementation work, this is very simple: git, for example, can bisect a series of changes, with each tested change marked as working or broken until the offending change is found.
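The idea behind bisecting can be sketched in a few lines of Python. This is only an illustration of the search strategy, not how git implements it. The names find_first_bad and is_broken are made up for this example, and the sketch assumes an ordered history in which the first revision works, the last one is broken, and every revision after the first broken one is broken as well.

```python
def find_first_bad(revisions, is_broken):
    """Return the first revision for which is_broken(revision) is true.

    Assumes revisions[0] works, revisions[-1] is broken, and that the
    history never switches back from broken to working.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_broken(revisions[mid]):
            bad = mid    # the offending change is at mid or earlier
        else:
            good = mid   # the offending change is after mid
    return revisions[bad]
```

Each test halves the remaining range, so the number of tests grows with the logarithm of the number of changes rather than with the number of changes itself.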

This approach can be extended by looking at different definitions for working and broken. As an example, working does not necessarily mean that it has to apply to the same program which is broken, but it can also mean a completely different program which however shares some design or implementation with the broken program. In relation to Blunt, I have been able to test some assumptions about the packet flow by checking Blunt 1 versus Blunt 2, even though both are internally very different.

It is important to keep a very open mind when looking for these instances of a working system; having them is the only way to provide a solid foundation for further debugging, and to stay sane. If everything is broken, making any progress is a pretty rocky road. It is not impossible though, and in those cases I usually start removing parts of the broken program until at least something works, and then work backwards by adding code.
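The "work backwards by adding code" step could look roughly like the following Python sketch. It assumes that the program can be split into independently switchable parts and that the stripped-down program with no parts enabled works. The function first_breaking_part and the works callback are hypothetical names used only for this illustration.

```python
def first_breaking_part(parts, works):
    """Re-enable parts one at a time and return the first one that breaks things.

    works(enabled) is expected to build and test the program with just the
    given parts enabled. Returns None if everything works with all parts.
    """
    enabled = []
    for part in parts:
        candidate = enabled + [part]
        if works(candidate):
            enabled = candidate   # still fine, keep this part switched on
        else:
            return part           # adding this part broke the program
    return None
```

The part returned is not necessarily the root cause on its own, since it may only fail in combination with parts enabled earlier, but it narrows down where to look next.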

Now, how about Blunt?

All of the above also applies to reverse engineering, which is still the biggest chunk of work in getting Blunt running. Reverse engineering is in some sense simpler than debugging since the code is known to work, and understanding it is achieved by disassembly, or by black box testing techniques to verify assumptions about how the code works. I have noticed that persistence and thoroughness really pay off in this area, and it seems that Blunt is very close to finally working as designed :) Stay tuned for more!

Dear TCommTool

What is it you really want? I've been experimenting with Blunt 2, setting up an internet connection using NIE and PPP over Bluetooth, and overall it appears to be quite stable. The only, very annoying problem is that the data is not handed back from Blunt to the NewtonScript layer (and thus NIE). The problem appears to lie in the inner workings of the TCommTool class, which handles this interaction. What is not entirely clear right now is how to indicate that data has arrived for further processing. This results in apparent packet loss at the NIE level, and in unnecessary packet retransmissions or aborted connections. It might be most effective to debug the data flow with Hammer, but the drawback of using Hammer is that it needs USB to serial drivers installed under Mac OS Classic. Well, I've gotten very far without Hammer; maybe now it's Hammer time again!

More reverse engineering results

I've been working with the NewtonOS sampler program some more, and have started to work on the inter-task communication. The more detailed results are on the NewtonOS Internals pages. So far, one important finding is that NewtonOS is very lightweight when it comes to memory management, and relies on a flat memory model with very few access restrictions. This means that memory must be allocated and deallocated with slightly different strategies than in a more conventional OS. Specifically, data is shared more openly between tasks and must be kept valid as long as any task might be using it, since the kernel does not create its own copies of data.

Technical Information on the NewtonOS

My "NewtonOS Sampler" is now up on github, although so far it is just a skeleton. I'll be using the Newton DDK to implement examples of the OS services. Additionally, Walter Smith and Paul Guyot have a collection of very good documents on the OS itself.

Getting to the bottom of it

The latest changes to Blunt 2 fixed the crash when sending a large number of data blocks, which was caused by memory management issues around the TUPort class used for communication between the tasks in Blunt. More specifically, the allocation and deallocation of memory for the messages was causing problems. I still need to find the right approach though, and it is probably best done via a small test program which uses tasks, shared memory, messages and ports.

Stabilizing Blunt 2

I've gotten Blunt 2 quite a bit more stable, and have also added missing functionality to send larger data packets in smaller chunks. I'm testing at the moment with a Conceptronic CBT100C and a PICO card, which both seem to be working quite well with 230kbps as the serial speed.
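Splitting a larger data packet into smaller chunks boils down to something like the following Python sketch. Blunt itself is not written in Python, and both the function name split_into_chunks and the chunk size used below are placeholders for this illustration, not values taken from the Blunt code.

```python
def split_into_chunks(data: bytes, max_chunk_size: int) -> list:
    """Split data into consecutive pieces of at most max_chunk_size bytes."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    return [data[offset:offset + max_chunk_size]
            for offset in range(0, len(data), max_chunk_size)]


# Example: a 10 byte payload sent in chunks of at most 4 bytes
chunks = split_into_chunks(b"0123456789", 4)
```

Only the last chunk can be shorter than the maximum size, and joining the chunks in order gives back the original data.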

Github continues to be very useful to manage the work and track issues (I should probably indicate in the git commits also the ID of any issue I'm fixing), and for those interested, the git repository actually contains the current working stack as an installable package in the "Bluetooth Setup.zip" file ;)

One remaining issue is long-term stability, which I think is related either to improper memory management for the messages sent between the server and the other parts of the system, or to problems with the interrupt handlers being paged out.

Blunt 2 status

Here's a quick update on Blunt 2. I've moved the code to Github and started to track issues there as well. My development environment is TextMate, and I use MPW just for compiling. The simple serial line based logging works reasonably well, and I've not really missed using Hammer as a debugger (Hammer is still essential for reverse engineering by executing code; for simple code analysis, I use my online database of the NewtonOS ROM).

Most problems in Blunt 2 are not in the actual Bluetooth stack, but around interfacing with the Bluetooth hardware: PC Cards are using a simple 16450 UART, and sending and receiving data is not trivial due to the lack of documentation for the serial chip interface on the Newton. I've switched to simple non-interrupt based sending of data, which means some performance penalty but should get around problems I've seen with sending bulk data. The other problematic interface is between the Bluetooth stack and the CommTool, which in turn manages the interface to the NewtonScript world. CommTools are even less documented unfortunately, so I am still lacking asynchronous sending functionality and means to accept connections instead of initiating them.

As expected, progress is slow. On the other hand, it is nice to work with what is probably the best notebook Apple ever made, my trusty old Pismo :)

Picking up the Gauntlet

After lengthy GTD experiments (more on that later), programming a bit for the iPhone and bringing development tools for the Newton into the 21st century, I'm thinking it's time to revisit Blunt 2. I have a simplified development setup where I need to use Mac OS 9 only for compiling, and can use e.g. TextMate for editing, a simple serial line for debug output and RDCL for package installation.

I uploaded the code as a first step to github, which should allow much better tracking of changes and experiments.

Newton ROM Cross Referencer

I have now moved my tool for cross referencing the Newton ROM to 40Hz.org. Its usage is quite simple: just enter the name of the function you want to view into the text field (use % as a wildcard character) and press "View". The resulting listing is hyperlinked to allow further digging into the ROM.
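For readers unfamiliar with % as a wildcard, here is a small Python sketch of how such a pattern can be matched against function names. It assumes that % stands for any sequence of characters, as in an SQL LIKE pattern, and that matching is case sensitive. Neither is confirmed for the cross referencer, and the helper names and the function names in the example are made up.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a pattern using % as wildcard into a compiled regex."""
    # Escape everything literally, then let each % match any run of characters
    literal_pieces = [re.escape(piece) for piece in pattern.split("%")]
    return re.compile(".*".join(literal_pieces))


def matching_names(pattern, names):
    """Return the names which match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


# Hypothetical function names, for illustration only
names = ["TUPort::Send", "TUPort::Receive", "TCommTool::Init"]
port_functions = matching_names("TUPort%", names)
```

With the names above, the pattern TUPort% selects the two TUPort entries, while a pattern without any % has to match a name exactly.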

Experimenting with iPhone Development

As a programmer, I'm trying to continuously learn new technologies, and playing around with iPhone development was inevitable. One aspect which motivated me actually goes back to the NeXT era, and that is Objective-C. Back in the day, I had just started diving into C++ together with a friend and his small company, since it was the logical next step from C, but Objective-C and Brad Cox's idea of Software-ICs already seemed very appealing then. It was unfortunately limited on the commercial side to NeXTStep and later OpenStep, whereas C++ was much easier to deploy.

A while ago I therefore started to experiment with iPhone app development to see what the fuss is about. Using Objective-C is very refreshing; it is great to have a compiled but also very dynamic language. It won't beat NewtonScript though ;). The result of my experiments is now available in the iTunes app store: a simple, free expense tracking app for personal use called Geld, including Dropbox support. Enjoy!

Frankennewton

Thanks to Frank Gruendel, I'm now the proud owner of three badly treated Newton MessagePads! My intention in acquiring them was to investigate case modding options. While I do like Jonathan Ive's original design of the MP2x00 series, there is some 90's bulkiness which would be great to shed. The most radical option is to take just the motherboard and display and construct a new housing. But even shaving off bits and pieces from the regular case might help.

Developing Newton apps with Einstein, TextMate and tntk

In this post I want to describe my setup for developing Newton apps on Mac OS X using Einstein, TextMate and tntk.

Einstein is working very well as a replacement for an actual Newton during development. An important enabler is the AppleScript interface, which allows "remote controlling" the emulator. TextMate is my editor of choice for other development projects already, and extending it to support NewtonScript and interfacing with Einstein is very easy. And finally, tntk is able to generate Newton packages of reasonable complexity if certain limitations are kept in mind.

Setup

Setting up Einstein is straightforward, and instructions on getting a ROM file are just an internet search away. I'm using the same screen resolution on the emulator as on an MP2100, and I install a list of very useful packages:

Unfortunately, BugTrap is not able to save trapped errors to the Notepad, but it is possible to use ViewFrame's intercept functionality to break on exceptions, which is a very good replacement.

tntk does not need much setup, it just has to be somewhere in the path on Mac OS X, e.g. copied to /usr/local/bin or /usr/bin. The usage of standard NewtonOS elements like protos and constants requires a platform file, which is part of the Newton Toolkit (found in the Platforms folder after installing the NTK on the Mac and named e.g. Newton 2.1).

I've created a NewtonScript bundle for TextMate (a current snapshot can be downloaded from the git repository on SourceForge; the unpacked folder has to be renamed to NewtonScript.tmbundle) to enable syntax highlighting and simplify package compilation and installation. Package installation is not very elegant yet: it requires a small AppleScript named Install.scpt in the same folder where the source files under development are located. Right now I am using a package specific script, but it could possibly be simplified (package name, symbol and path obviously need to be replaced with real values):

tell application "Einstein"
    do newton script "GetRoot().|<package symbol>|:Close()"
    do newton script "SafeRemovePackage(GetPkgRef(\"<package name>\", GetStores()[0]))"
    install package "<path to package file>"
end tell

It closes the application running in Einstein, removes the package, and then installs the new package. The TextMate bundle is configured to run the script via the shortcut Command-Shift-B.

Usage

The development cycle using tntk and Einstein is slightly different from using the NTK. One major difference is the absence of the Inspector, but ViewFrame is a very good replacement, and in some areas even superior. One big advantage is the ability to use standard source code version control systems since tntk uses text files. It has always bothered me to not have good change control when developing Newton applications.

Converting existing Newton applications is not straightforward, but possible if the original NTK project is available. The NTK allows saving the project as a text file, which can be modified into a tntk compatible source file. In general, the original developer documentation by Apple is an excellent resource to get started, and ViewFrame is very useful in understanding the inner workings of applications.

Happy hacking!

Autoparts with tntk

Autoparts for the Newton are packages which do not have any visible user interface, but add functionality behind the scenes. They are usually listed in the Extensions folder of the Extras drawer.

It is possible to use tntk to create autoparts, and this blog entry demonstrates how to add a module (which admittedly doesn't do much) to the NiftyDrop backdrop application.

The example source can be found on SourceForge.

Project File

The tntk project file is very similar to a project file for a regular application; the only difference is that it uses the part type auto instead of form.

Below you can see that the project contains only one part, which has the type auto and is made up of only one source file, main.newt.

{
    parts: [
        {
            main: "main.newt",
            files: [],
            type: 'auto
        }
    ],

The package name definition and the reference to the platform file are next:

    name: "NewtAutopartExample:40hz",
    platform: "/Applications/Newton/NTK 1.6.4/Platforms/Newton 2.1"
}

The platform file is part of the Newton Toolkit, and tntk uses it to find predefined constants such as proto names.

Main View

The only source file the example uses is main.newt.

The first part of the file consists of common constant definitions (the naming convention I chose is to start constants with a k, methods with an M and fields with an f):

constant kAppSymbol := '|NewtAutopartExmple:40Hz|;

constant kNiftySym := '|NiftyDrop:HyprMynd|;
constant kNiftyRegistry := '|Registry:NiftyDrop:HyprMynd|;
constant kModuleSym := '|Nifty:NewtAutopartExmple:40Hz|;

Next follows a function used to add a module to NiftyDrop:

constant kAddToRegistry := func (module) begin
    local reg := GetGlobals().(kNiftyRegistry);
    local sym := EnsureInternal (module.symbol);
    if not reg then begin
        reg := TotalClone ({
            modules: {},
            prefs: {},
            infoItems: {}
        });
        GetGlobals().(kNiftyRegistry) := reg;
    end;
    reg.modules.(sym) := module;
end;

The function creates NiftyDrop's module registry if it doesn't exist already, and uses the new module's symbol as a slot name to refer to the module code.

A function to remove a module from NiftyDrop is next. It removes the module's slot from the module registry, takes the module out of the set of active modules, and removes the module's slot from NiftyDrop's preferences:

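constant kRemoveFromRegistry := func (sym) begin
    local reg := GetGlobals().(kNiftyRegistry);
    // Declare p explicitly so the assignment cannot hit a slot or global of the same name
    local p := GetAppPrefs (kNiftySym, {});

    // The registry may not exist if no module was ever added
    if reg then RemoveSlot (reg.modules, sym);
    SetRemove (p.activeModules, sym);
    RemoveSlot (p, sym);
    EntryChangeXmit (p, nil);
end;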

Next up is the actual module view. Like the other elements defined so far, it is a constant, declared for later use. The view itself is just a gray rectangle with the default size of 100x60:

constant kModuleView := {
    viewClass: clView,
    viewFlags: vVisible,
    viewFormat: vfFillGray,
    viewJustify: vjParentLeftH + vjParentTopV,
    viewBounds: {left: 0, top: 0, right: 100, bottom: 60},

NiftyDrop expects a number of additional slots, most importantly the name and symbol:

    resizeable: true,
    symbol: kModuleSym,
    name: "Example Module",
    minWidth: 100,
    minHeight: 60,
    widthIncr: 1,
    heightIncr: 1,

};

The view defined in kModuleView offers no user interaction and has no other visible elements than just a gray rectangle. It would be very simple to add new views in an array in the stepChildren slot of the view.

All of the code comes together in the install and remove scripts. The Newton Toolkit uses a slightly different naming convention and shifts some code around. Instead of the InstallScript and RemoveScript functions used in the NTK, I'm using a slightly different setup below.

First, the script run at installation time adds the module to the module registry, and lets NiftyDrop know about it:

{
    devInstallScript: func (partFrame, removeFrame) begin
        call kAddToRegistry with (kModuleView);
        if Visible (GetRoot().(kNiftySym)) then GetRoot().(kNiftySym):prefsChanged ('moduleAdded, nil);
    end,

The script run at removal time (e.g. when freezing or deleting the extension, or removing the card it is stored on) closes NiftyDrop and then removes the module from the registry:

    devRemoveScript: func(partFrame) begin
        if Visible (GetRoot().(kNiftySym)) then GetRoot().(kNiftySym):Close ();
        call kRemoveFromRegistry with (kModuleSym);
    end,

The actual installation script runs the devInstallScript and returns the removal script to the system:

    InstallScript: func (partFrame) begin
        local removeFrame := EnsureInternal ({removeScript: partFrame.devRemoveScript});
        partFrame:devInstallScript (partFrame, removeFrame);
        return removeFrame;
    end,
};

Notes

tntk creates NewtonScript packages in a rather unusual fashion using NEWT/0. Technically, a tntk program is executed as NewtonScript, and the return value of the program is the input to the package creation step. That means that a program is not just compiled, but it generates itself. The return value is the code between the last pair of opening and closing braces in the example above, i.e. a frame containing the installation and removal scripts. These scripts in turn refer to the rest of the code.

One caveat with this process is that NEWT/0 captures the complete execution environment and returns it. Since it is executing a function, and since NewtonScript functions are proper closures, care must be taken to not pull in objects which are not needed. If in the example above the elements are not declared as constants, they will be passed to any function within the code as the environment for the function's execution, which in extreme cases causes the NewtonScript interpreter to run out of memory.

An improvement of the package generation and compilation step would be to use NEWT/0 only for function bodies, and use a more traditional code generation process to generate the package code.