F# Scribbles

My experiments with F#

Generate plain item list from a catalog in a web page

As part of my IoT-related experiments, I am reading the book Exploring Arduino for some insight. The page http://exploringarduino.com/parts lists all the electronic parts required for the book's projects. In this exercise, I saved the HTML of the catalog from the website using Chrome and then used an F# script to extract a plain list of items.

In the HTML, each item title appears in the format title="{item name}".

I define a regex to extract that item name from every line.

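open System
open System.IO
open System.Text.RegularExpressions

// Extract the item name from a line containing title="{item name}".
// Returns None when the line has no title attribute.
let parseLine (line: string) =
    // [^"]* stops at the first closing quote, so later attributes are not captured.
    let m = Regex.Match(line, "title=\"([^\"]*)\"")
    if m.Success then Some m.Groups.[1].Value else None

// Keep only the lines that carry a title and join the names, one per line.
let parse (lines: seq<string>) =
    lines
    |> Seq.choose parseLine
    |> String.concat Environment.NewLine

// __INPUT_FILE__ and __OUTPUT_FILE__ are placeholders for the actual file paths.
let writefile content = File.WriteAllText(__OUTPUT_FILE__, content)

File.ReadLines(__INPUT_FILE__)
|> parse
|> writefile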
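
To see how the choice of pattern affects what gets captured, here is a small sketch that compares a greedy capture with one that stops at the first closing quote. The sample line is made up for illustration (the real catalog markup may look different), and the names sample, greedy and strict are my own.

```fsharp
open System.Text.RegularExpressions

// Hypothetical sample line; the real catalog markup may differ.
let sample = """<a href="#" title="10kohm Resistor" class="part">10kohm Resistor</a>"""

// Greedy capture: (.*) runs on to the last double quote on the line.
let greedy = Regex.Match(sample, """.*(title=\"(.*)\").*""").Groups.[2].Value

// Negated character class: the capture stops at the first closing quote.
let strict = Regex.Match(sample, "title=\"([^\"]*)\"").Groups.[1].Value

printfn "greedy: %s" greedy
printfn "strict: %s" strict
```

With this sample, the greedy pattern captures 10kohm Resistor" class="part, while the second pattern captures only 10kohm Resistor. If title is always the last attribute on the line, both patterns give the same result.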

Here is the outcome:

.1uF Electrolytic Capacitor
100 ohm Resistor
10kohm Potentiometer
10kohm Resistor
10uF Electrolytic Capacitor
150ohm Resistor
16x2 LCD
1kohm Resistor
...