Webb Master

Create Inline Data from CSV in Haxe

One of the things that makes Haxe really powerful, but can also be intimidating, is its macro system. With it, you can write code that generates code at compile time, which lets you do useful things like checking the validity of data files or even transforming them into literals in the code. There’s much more you can do with it, but loading data at compile time is something I see a lot in game development (including castle, which I blogged about previously).

There is also a blog post about doing it with JSON to fill in the members of a class. What I present here is a much simpler example that I needed for something I’m working on that’s unrelated to gaming. I think this simpler example makes it easier to tell what’s going on than a lot of the snazzier code out there.

I start with a CSV file:

canonical,pos_type,clause,tense,hiragana,katakana,katakana_chouonpu
ため,名詞,非自立,一般,ため,タメ,タメ
まんま,名詞,非自立,副詞可能,まんま,マンマ,マンマ
以上,名詞,非自立,副詞可能,以上,イジョウ,イジョー
際,名詞,非自立,副詞可能,際,サイ,サイ
ふし,名詞,非自立,一般,ふし,フシ,フシ
種,名詞,非自立,一般,種,シュ,シュ
ところ,名詞,非自立,副詞可能,ところ,トコロ,トコロ
様,名詞,非自立,助動詞語幹,様,ヨウ,ヨー
うち,名詞,非自立,副詞可能,うち,ウチ,ウチ
程,名詞,非自立,一般,程,ホド,ホド
そう,名詞,特殊,助動詞語幹,そう,ソウ,ソー
せい,名詞,非自立,一般,せい,セイ,セイ
自身,名詞,非自立,副詞可能,自身,ジシン,ジシン
ごと,名詞,非自立,副詞可能,ごと,ゴト,ゴト
とき,名詞,非自立,一般,とき,トキ,トキ

I don’t want to have to load this CSV at runtime; I want to turn it into an array literal. This is really handy when you’re running JS in the browser, and it can also save work when making apps. While I didn’t add it in this example, it would be easy to validate the CSV and make compilation fail if validation fails, like this example does with JSON. Here I present two ways of doing it, either as an array of arrays or as an array of Dynamics:

package;

#if macro
import sys.io.File;
import haxe.macro.Expr;
import haxe.macro.Context;
#end

class ArrayGenerator
{
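    // Reads the CSV while compiling and returns an array-of-arrays literal.
    // A relative fileName is resolved against the directory the compiler runs in.
    macro private static function arraysFromCSV(fileName:String)
    {
        var input = File.read(fileName, false);
        var lines = [];

        try {
            // readLine() throws Eof once the file is exhausted, which ends the loop.
            while (true) {
                var line = input.readLine();
                // Naive split: quoted fields that contain commas are not handled.
                var cols = line.split(',');
                // $v{} turns the Array<String> into an array-literal expression.
                lines.push(macro $v{cols});
            }
        }
        catch (ex:haxe.io.Eof) {}
        input.close();

        // $a{} wraps the collected row expressions in one outer array literal.
        return macro $a{lines};
    }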

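    // Reads the CSV while compiling and returns an array of object literals.
    // With no header argument, the first row supplies the field names.
    macro private static function objectsFromCSV(fileName:String, header:Array<String> = null)
    {
        var input = File.read(fileName, false);
        var lines = [];

        try {
            // readLine() throws Eof once the file is exhausted, which ends the loop.
            while (true) {
                var line = input.readLine();
                var cols = line.split(',');
                if (header == null) header = cols;
                else {
                    // Build one {field: value, ...} declaration per data row.
                    var obj = [];
                    for (i in 0...header.length) {
                        obj.push({field: header[i], expr: macro $v{cols[i]}});
                    }
                    // Context.currentPos() is the position of the macro call.
                    lines.push({expr: EObjectDecl(obj), pos: Context.currentPos()});
                }
            }
        }
        catch (ex:haxe.io.Eof) {}
        input.close();

        return macro $a{lines};
    }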

    private static var nounsAsArrays:Array<Array<String>> = ArrayGenerator.arraysFromCSV('nouns.csv');
    private static var nounsAsObjects:Array<Dynamic> = ArrayGenerator.objectsFromCSV('nouns.csv');

    public static function getNounsAsArrays():Array<Array<String>>
    {
        return nounsAsArrays;
    }

    public static function getNounsAsObjects():Array<Dynamic>
    {
        return nounsAsObjects;
    }
}

class ArrayGeneratorTest
{
    static function main()
    {
        var nounsAsArrays = ArrayGenerator.getNounsAsArrays();
        trace(nounsAsArrays[1][0]);
        var nounsAsObjects = ArrayGenerator.getNounsAsObjects();
        trace(nounsAsObjects[2].canonical);
    }
}
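
The validation mentioned before the listing could be sketched roughly like this. It is a minimal, untested sketch rather than part of the code above: the class name ValidatedArrayGenerator and the helper rowError are hypothetical, and the only rules checked (every row has as many columns as the header, and no cell is empty) are assumptions about what "valid" means for this file. Context.error stops compilation and reports the message at the place where the macro was called.

```haxe
package;

#if macro
import sys.io.File;
import haxe.macro.Context;
#end

class ValidatedArrayGenerator
{
    // Hypothetical helper: describes what is wrong with a row,
    // or returns null if the row looks fine.
    public static function rowError(header:Array<String>, cols:Array<String>):Null<String>
    {
        if (cols.length != header.length)
            return 'expected ' + header.length + ' columns, found ' + cols.length;
        for (i in 0...cols.length)
            if (cols[i] == '') return 'empty value in column ' + header[i];
        return null;
    }

    // Same idea as arraysFromCSV, but a bad row becomes a compile error.
    macro public static function arraysFromCSV(fileName:String)
    {
        var input = File.read(fileName, false);
        var header:Array<String> = null;
        var lines = [];
        var lineNo = 0;

        try {
            while (true) {
                var cols = input.readLine().split(',');
                lineNo++;
                // The first row defines the shape every later row must match.
                if (header == null) header = cols;
                var err = rowError(header, cols);
                if (err != null)
                    Context.error(fileName + ':' + lineNo + ': ' + err, Context.currentPos());
                lines.push(macro $v{cols});
            }
        }
        catch (ex:haxe.io.Eof) {}
        input.close();

        return macro $a{lines};
    }
}
```

Keeping the check in an ordinary static function means the same rule can also be exercised outside the macro, for example from a unit test.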

I believe it’s a lot more robust to use a typedef instead of a Dynamic:

typedef Word = {
    var canonical:String;
    // and so on...
}

You can define that in code and use it to validate that what’s in the CSV matches your expectations (again, at compile time), or even use macro magic(k) to derive a type as castle does. I’m also treating everything as strings, which is fine for this case but likely not what most people ingesting CSVs want. I leave fixing these things as an exercise for the reader, and probably for near-future me as I continue working on the thing that prompted this post.
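
As a rough illustration of the typedef route, here is a sketch that fills in the rest of Word from the CSV header shown earlier and checks the header at compile time. It is untested, and the names WordGenerator, headerError, wordsFromCSV and WordGeneratorTest are all hypothetical. Because the field names are fixed by the typedef, the object literal can be written with plain reification instead of building EObjectDecl by hand.

```haxe
package;

#if macro
import sys.io.File;
import haxe.macro.Context;
#end

// One field per CSV column; the names mirror the header row of nouns.csv.
typedef Word = {
    var canonical:String;
    var pos_type:String;
    var clause:String;
    var tense:String;
    var hiragana:String;
    var katakana:String;
    var katakana_chouonpu:String;
}

class WordGenerator
{
    // Hypothetical helper: null if the header row lists exactly the
    // fields of Word, in order; otherwise a description of the mismatch.
    public static function headerError(cols:Array<String>):Null<String>
    {
        var expected = 'canonical,pos_type,clause,tense,hiragana,katakana,katakana_chouonpu';
        var found = cols.join(',');
        if (found == expected) return null;
        return 'expected header "' + expected + '", found "' + found + '"';
    }

    macro public static function wordsFromCSV(fileName:String)
    {
        var pos = Context.currentPos();
        var input = File.read(fileName, false);
        var words = [];
        var lineNo = 0;

        try {
            while (true) {
                var cols = input.readLine().split(',');
                lineNo++;
                if (lineNo == 1) {
                    // Fail the build if the file no longer matches the typedef.
                    var err = headerError(cols);
                    if (err != null) Context.error(fileName + ': ' + err, pos);
                    continue;
                }
                if (cols.length != 7)
                    Context.error(fileName + ':' + lineNo + ': expected 7 columns', pos);
                words.push(macro {
                    canonical: $v{cols[0]},
                    pos_type: $v{cols[1]},
                    clause: $v{cols[2]},
                    tense: $v{cols[3]},
                    hiragana: $v{cols[4]},
                    katakana: $v{cols[5]},
                    katakana_chouonpu: $v{cols[6]}
                });
            }
        }
        catch (ex:haxe.io.Eof) {}
        input.close();

        return macro $a{words};
    }
}

class WordGeneratorTest
{
    // Typed as Array<Word>, so a misspelled field access is a compile error.
    static var words:Array<Word> = WordGenerator.wordsFromCSV('nouns.csv');

    static function main()
    {
        trace(words[0].katakana);
    }
}
```

The trade-off is that the column list now lives in two places, the typedef and the macro, which is what deriving the type with a macro would avoid.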

Below is the output in JS. It’s very compact, and you can see all that beautiful data written out in JS, not loaded at runtime:

// Generated by Haxe 3.4.2
(function () { "use strict";
var ArrayGenerator = function() { };
ArrayGenerator.getNounsAsArrays = function() {
        return ArrayGenerator.nounsAsArrays;
};
ArrayGenerator.getNounsAsObjects = function() {
        return ArrayGenerator.nounsAsObjects;
};
var ArrayGeneratorTest = function() { };
ArrayGeneratorTest.main = function() {
        var nounsAsArrays = ArrayGenerator.getNounsAsArrays();
        console.log(nounsAsArrays[1][0]);
        var nounsAsObjects = ArrayGenerator.getNounsAsObjects();
        console.log(nounsAsObjects[2].canonical);
};
ArrayGenerator.nounsAsArrays = [["canonical","pos_type","clause","tense","hiragana","katakana","katakana_chouonpu"],["ため","名詞","非自立","一般","ため","タメ","タメ"],["まんま","名詞","非自立","副詞可能","まんま","マンマ","マンマ"],["以上","名詞","非自立","副詞可能","以上","イジョウ","イジョー"],["際","名詞","非自立","副詞可能","際","サイ","サイ"],["ふし","名詞","非自立","一般","ふし","フシ","フシ"],["種","名詞","非自立","一般","種","シュ","シュ"],["ところ","名詞","非自立","副詞可能","ところ","トコロ","トコロ"],["様","名詞","非自立","助動詞語幹","様","ヨウ","ヨー"],["うち","名詞","非自立","副詞可能","うち","ウチ","ウチ"],["程","名詞","非自立","一般","程","ホド","ホド"],["そう","名詞","特殊","助動詞語幹","そう","ソウ","ソー"],["せい","名詞","非自立","一般","せい","セイ","セイ"],["自身","名詞","非自立","副詞可能","自身","ジシン","ジシン"],["ごと","名詞","非自立","副詞可能","ごと","ゴト","ゴト"],["とき","名詞","非自立","一般","とき","トキ","トキ"]];
ArrayGenerator.nounsAsObjects = [{ canonical : "ため", pos_type : "名詞", clause : "非自立", tense : "一般", hiragana : "ため", katakana : "タメ", katakana_chouonpu : "タメ"},{ canonical : "まんま", pos_type : "名詞", clause : "非自立", tense : "副詞可能", hiragana : "まんま", katakana : "マンマ", katakana_chouonpu : "マンマ"},{ canonical : "以上", pos_type : "名詞", clause : "非自立", tense : "副詞可能", hiragana : "以上", katakana : "イジョウ", katakana_chouonpu : "イジョー"},{ canonical : "際", pos_type : "名詞", clause : "非自立", tense : "副詞可能", hiragana : "際", katakana : "サイ", katakana_chouonpu : "サイ"},{ canonical : "ふし", pos_type : "名詞", clause : "非自立", tense : "一般", hiragana : "ふし", katakana : "フシ", katakana_chouonpu : "フシ"},{ canonical : "種", pos_type : "名詞", clause : "非自立", tense : "一般", hiragana : "種", katakana : "シュ", katakana_chouonpu : "シュ"},{ canonical : "ところ", pos_type : "名詞", clause : "非自立", tense : "副詞可能", hiragana : "ところ", katakana : "トコロ", katakana_chouonpu : "トコロ"},{ canonical : "様", pos_type : "名詞", clause : "非自立", tense : "助動詞語幹", hiragana : "様", katakana : "ヨウ", katakana_chouonpu : "ヨー"},{ canonical : "うち", pos_type : "名詞", clause : "非自立", tense : "副詞可能", hiragana : "うち", katakana : "ウチ", katakana_chouonpu : "ウチ"},{ canonical : "程", pos_type : "名詞", clause : "非自立", tense : "一般", hiragana : "程", katakana : "ホド", katakana_chouonpu : "ホド"},{ canonical : "そう", pos_type : "名詞", clause : "特殊", tense : "助動詞語幹", hiragana : "そう", katakana : "ソウ", katakana_chouonpu : "ソー"},{ canonical : "せい", pos_type : "名詞", clause : "非自立", tense : "一般", hiragana : "せい", katakana : "セイ", katakana_chouonpu : "セイ"},{ canonical : "自身", pos_type : "名詞", clause : "非自立", tense : "副詞可能", hiragana : "自身", katakana : "ジシン", katakana_chouonpu : "ジシン"},{ canonical : "ごと", pos_type : "名詞", clause : "非自立", tense : "副詞可能", hiragana : "ごと", katakana : "ゴト", katakana_chouonpu : "ゴト"},{ canonical : "とき", pos_type : "名詞", clause : "非自立", tense : "一般", hiragana : "とき", katakana : "トキ", katakana_chouonpu : "トキ"}];
ArrayGeneratorTest.main();
})();

Beautiful! Ready to be plopped into a browser.