One off Scripts

Posted by Alan Barr on Fri 23 February 2018

Languages

I am fascinated with languages of all kinds, especially spoken, written, and computer languages. Languages vary greatly in how they construct meaning, order words, and frame concepts. I find that languages open up my creativity and let me think about ideas in ways I would not have considered before. I enjoy learning computer languages, and even though I may not know them all through and through, seeing how they interpret ideas and deal with their strengths and weaknesses is very interesting.

Website Hosting and One Off Scripts

For a long time my website has been hosted by nearlyfreespeech.net, and I went so far as to install WordPress, PHP, and MySQL. I did not use many of these features, but it was a handy learning experience until I was ready to start tweaking PHP itself. These days a statically compiled site is the right level of effort for me. I do not have much need for dynamism on my website, and if I do need it I can quickly write some JavaScript to achieve the effect. The great benefit of nearlyfreespeech.net is that I pay a reasonable cost and do not maintain the infrastructure. I have considered Amazon S3 to host my pages, but it seems like a bit much to set up. GitHub Pages could be an option as well, but honestly I can never be sure it will be around forever.

To work with nearlyfreespeech.net I have to use the tools my workspace gives me. Since Python, PHP, and Node were installed, I ended up using various utilities to generate my website. For a while I looked for a static site generator and ultimately settled on a Python one named Pelican because I found a theme I liked. Later I realized I needed a sitemap and decided I could use PHP to recursively walk a folder and write out XML. I achieved this but was not too happy with the results. PHP reminds me of the indomitable will of the human spirit. It worked and was good enough. Over time I became obsessed with website performance and making my pages as light as possible, and I ended up using uglifyjs to minify my pages. I adore JavaScript, but the event-loop, async nature of everything gets annoying every once in a while.

To orchestrate this process I have a bash script that calls each tool one by one and copies, moves, and deletes things as needed. Bash works, but I have never truly loved the syntax of the shell languages. They are great at what they do, but it has always been a sore spot for me. Python is great for some things, but some of its APIs succeed more than others, and I just do not get much satisfaction out of writing it. Pandas and sklearn make up for any utility coding complaints I have about it, though.

One Off

Utility scripts are various types of glue code: pieces strung together to form a pipeline of activities, with each tool achieving one task in one form or another. Each language can excel in its own way at some task. In this case I finally learned how I could use F# Interactive on nearlyfreespeech.net. Most of the machines they provide are FreeBSD based. Mono is available, but it's a bit out of date. That is no shock: Mono is a bit of an oddity even on Linux, so I could not expect much interest in keeping it up to date on FreeBSD. My PHP sitemap generator had reached the end of its existence. I wrote it so I could feel a little connected to the language after using it at the marketing companies I previously worked at.

Languages

For right now F# is my favorite language to work with. It is succinct and good at what it does, and for me it has full power in being either functional or object oriented. I am not a fan of extremes, and the language enforces enough constraints that I do not feel I am losing much. Languages that enforce an extreme paradigm make me feel like I'm forced to use an abstraction when I would prefer not to. Functions as the unit of communication make a lot more sense to me. One-off scripts tend to be imperative, and if there is repetition an object can come in handy.

The steps for our sitemap generator should be minimal. We need to recursively get a list of all .html files in a directory. I found a script on F# Snippets for finding the files. I pass in the directory I want to search and store the results in files. Ultimately I want to write out an XML file that lists the webpages a crawler can find on my website. The pages binding creates one big string of XML nodes in the format required for listing webpages. The outputSitemap binding wraps that string in the header and footer and writes it out. There is room for improvement, since the script assumes many things instead of taking some parameters as arguments. However, for a one-off script I feel content that it achieves what I want it to do, and I can pull it up in the future and understand pretty quickly what I was attempting to achieve.

    open System
    open System.IO

    // Recursively enumerate every file under the given directories.
    // From F# Snippets.
    let rec allFiles dirs =
         if Seq.isEmpty dirs then Seq.empty else
             seq { yield! dirs |> Seq.collect Directory.EnumerateFiles
                   yield! dirs |> Seq.collect Directory.EnumerateDirectories |> allFiles }

    let files = allFiles (seq ["/home/public/"])

    // Keep only the .html pages.
    let htmlOnly = Seq.filter (fun (file: string) -> file.EndsWith(".html")) files

    // Turn each filesystem path into the public URL of the page.
    let replaceRoot (htmlfiles: seq<string>) =
        Seq.map (fun (file: string) -> file.Replace("/home/public/", "https://www.alanmbarr.com/")) htmlfiles

    // Wrap each URL in the <url><loc> node the sitemap format expects.
    let replaceWithXMLNode (htmlfiles: seq<string>) =
        Seq.map (fun (file: string) -> sprintf "<url><loc>%s</loc></url>" file) htmlfiles

    // String.concat yields "" for an empty sequence, where Seq.reduce would throw.
    let pages =
        htmlOnly
        |> replaceRoot
        |> replaceWithXMLNode
        |> String.concat ""

    // Evaluated once when the script loads: assembles and writes the sitemap.
    let outputSitemap =
        let header = """<?xml version="1.0" encoding="UTF-8"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">"""
        let footer = "</urlset>"
        let xmlmap = header + pages + footer
        File.WriteAllText("/home/public/sitemap.xml", xmlmap)
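
If I ever do come back to this, the hard-coded paths are the first thing I would probably pull out. What follows is only a rough sketch of that idea, not the script I actually run. The names toUrl, sitemapXml, and writeSitemap are made up for illustration, and it assumes the root is passed with a trailing slash, as it is above. It also escapes each URL with SecurityElement.Escape, since an ampersand in a file name would otherwise produce invalid XML.

```fsharp
open System.IO
open System.Security

// Hypothetical names throughout; none of these come from the original script.

// Map a file path under `root` to its public URL under `baseUrl`.
let toUrl (root: string) (baseUrl: string) (file: string) =
    let relative = file.Substring(root.Length).TrimStart('/')
    baseUrl.TrimEnd('/') + "/" + relative

// Build the sitemap document from a sequence of .html file paths.
let sitemapXml (root: string) (baseUrl: string) (files: seq<string>) =
    let header = """<?xml version="1.0" encoding="UTF-8"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">"""
    let nodes =
        files
        |> Seq.map (fun file ->
            // Escape &, <, >, and quotes so an odd file name cannot break the XML.
            sprintf "<url><loc>%s</loc></url>" (SecurityElement.Escape(toUrl root baseUrl file)))
        |> String.concat ""
    header + nodes + "</urlset>"

// Find every .html file under `root` and write the sitemap to `outputPath`.
let writeSitemap (root: string) (baseUrl: string) (outputPath: string) =
    Directory.EnumerateFiles(root, "*.html", SearchOption.AllDirectories)
    |> sitemapXml root baseUrl
    |> fun xml -> File.WriteAllText(outputPath, xml)
```

With that in place, the behaviour of the script above would be a single call: writeSitemap "/home/public/" "https://www.alanmbarr.com/" "/home/public/sitemap.xml". Keeping sitemapXml free of file access also means it can be poked at in F# Interactive with a hand-written list of paths.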