Archive for category Computing

R running individual tests in one file

Aug 29

Posted by lejon in Computing, R | Comments off

Often when a test suite fails you want to re-run just a subset of the whole suite to troubleshoot individual problems. In R this can be done with devtools::test (which runs testthat) by giving a ‘filter’. The filter is a regex that is matched against the file names of the tests you want to run. Like this:

devtools::test(filter="pdf-un")

This runs only the tests in the file “test-pdf-uncompress.R” (assuming no other files match the filter).
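To illustrate how such a filter selects files, here is a minimal sketch in Python (not R) of regex matching against test file names. It assumes, as I understand testthat's behaviour, that the “test-” prefix and the “.R” suffix are stripped before matching. The function name and every file name except test-pdf-uncompress.R are made up for the example.

```python
import re

def select_test_files(filenames, pattern):
    """Return the test files whose stripped name matches the regex pattern."""
    selected = []
    for name in filenames:
        # Assumption: the "test-" prefix and ".R" suffix are removed before matching.
        stripped = re.sub(r"^test-", "", name)
        stripped = re.sub(r"\.[rR]$", "", stripped)
        if re.search(pattern, stripped):
            selected.append(name)
    return selected

# Hypothetical test directory; only the first name comes from the post.
files = ["test-pdf-uncompress.R", "test-pdf-parse.R", "test-utils.R"]
print(select_test_files(files, "pdf-un"))
```

A broader filter such as "pdf" would select both PDF test files in this made-up directory, which is why the post adds the caveat about other files matching.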

Calculating PI on a Raspberry Pi Spark Cluster

Jul 25

Posted by lejon in Computing | 2 Comments

As an example of progress in society, I will show how you can compute pi using distributed computing with Spark on Raspberry Pis, on your own little local computing cluster.

Install Raspbian on your Raspberry Pis.

Install Java on your Raspberry Pi thus:

sudo apt-get update && sudo apt-get install oracle-java7-jdk

Install ssh on your Raspberry Pi thus:

sudo apt-get install ssh

Fetch Apache Spark to each of your Raspberry Pis:

wget http://d3kbcqa49mib13.cloudfront.net/spark-1.0.1-bin-hadoop2.tgz

Also install Spark on your master machine, in my case my MacBook Pro. The current version of Spark (1.0.1) wants all installations of Spark to be in the same folder on all of the machines, so I put them in /usr/local/spark. To be precise, I copied the unpacked Spark folder structure to /usr/local and then made a symbolic link to this folder, calling it “spark”.

This is done on all workers (Raspberry Pis) and the master machine (MacBook Pro):

sudo mv spark-1.0.1-bin-hadoop2 /usr/local/
cd /usr/local
sudo ln -s spark-1.0.1-bin-hadoop2 spark

I then tell the master machine which IP to export so that the workers can connect to it. This is done by:

export SPARK_MASTER_IP=10.0.1.10

I can now run the “Master Start” script on the master (Macbook Pro):

./sbin/start-master.sh

and then start the workers on each of the Raspberry Pis:

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://10.0.1.10:7077

I can now submit jobs (for instance to calculate PI using Java) to the cluster on my master machine by:

./bin/spark-submit --master spark://10.0.1.10:7077 --class org.apache.spark.examples.JavaSparkPi lib/spark-examples-1.0.1-hadoop2.2.0.jar

Or using the Python Spark version:
./bin/spark-submit --master spark://10.0.1.10:7077 examples/src/main/python/pi.py 10

To monitor my cluster I can browse to:

http://localhost:8080

Raspberry Pi Spark Cluster

We can now calculate pi to:

“Pi is roughly 3.131820”

and it only takes:

116.324641 seconds, now that is progress! 😉
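The result is only “roughly” pi because the Pi examples bundled with Spark estimate it by Monte Carlo sampling: random points are thrown at a square and the fraction landing inside the inscribed circle is counted. Below is a minimal single-machine sketch of that idea in plain Python, without Spark. The function name, the sample count and the seed are my own choices for illustration, not taken from the Spark example.

```python
import random

def estimate_pi(num_samples, seed=None):
    """Estimate pi by sampling points uniformly in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # The point lies inside the unit circle if x^2 + y^2 <= 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of circle / area of square = pi / 4.
    return 4.0 * inside / num_samples

print("Pi is roughly %f" % estimate_pi(100000, seed=42))
```

On a cluster the samples are split into partitions that are counted in parallel and then summed, but the arithmetic is the same.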

Tags: java, raspberry pi, spark

Convert DataArray taken from a DataFrame to an Array / Vector in Julia

Jun 26

Posted by lejon in Computing, julia | Comments off

julia> samples = DataFrame(CCnt=1:10,Alpha=21:30)
10x2 DataFrame:
         CCnt Alpha
[1,]        1    21
[2,]        2    22
[3,]        3    23
[4,]        4    24
[5,]        5    25
[6,]        6    26
[7,]        7    27
[8,]        8    28
[9,]        9    29
[10,]      10    30

julia> samples[:CCnt]
10-element DataArray{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

julia> vector(samples[:CCnt])
10-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

Tags: julia

Add / Concat / append / rbind row to Julia DataFrame

Jun 22

Posted by lejon in Computing, julia | Comments off

In Julia you use vcat to add or append or concatenate a row of data to a Julia DataFrame.

Example:
julia> mydf = DataFrame(X=[0:10],Y=[100:110])
11x2 DataFrame:
          X   Y
[1,]      0 100
[2,]      1 101
[3,]      2 102
[4,]      3 103
[5,]      4 104
[6,]      5 105
[7,]      6 106
[8,]      7 107
[9,]      8 108
[10,]     9 109
[11,]    10 110

julia> mydf = vcat(mydf,DataFrame(X=12,Y=15))
12x2 DataFrame:
          X   Y
[1,]      0 100
[2,]      1 101
[3,]      2 102
[4,]      3 103
[5,]      4 104
[6,]      5 105
[7,]      6 106
[8,]      7 107
[9,]      8 108
[10,]     9 109
[11,]    10 110
[12,]    12  15

Tags: julia

Assign a value in Perl only if a regex matches

May 4

Posted by lejon in Computing, perl | Comments off

Sometimes (especially in one-liners) you want to assign a value only if a corresponding regex (regular expression) that picks out the value matches. I.e., if it has matched once, you don’t want the value overwritten with undef when the regex fails on a subsequent row in your file.

This can be solved thusly:

$var = $1 if (/Correct (\d+) %/);

The above snippet assigns $var if the regex on the right-hand side matches and picks out a value (via the capturing parentheses), and otherwise leaves it unchanged. Note that the quantifier sits inside the parentheses: (\d+) captures the whole number, whereas (\d)+ would capture only its last digit.
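For comparison, the same "assign only on a match" idea can be sketched in Python. This is an illustration only, not part of the Perl one-liner; the function name and the sample lines are made up. It uses (\d+) so that the whole number is captured.

```python
import re

PATTERN = re.compile(r"Correct (\d+) %")

def last_matched_value(lines):
    """Return the value captured from the most recent matching line, or None."""
    value = None
    for line in lines:
        match = PATTERN.search(line)
        # Only overwrite the value when the regex matches; a failed match
        # on a later line leaves the earlier value untouched.
        if match:
            value = match.group(1)
    return value

# Made-up input: the match on the first line survives the later non-matches.
lines = ["Correct 60 %", "some other row", "no match here"]
print(last_matched_value(lines))
```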

Tags: perl

Perl one-liner to calculate an average of some value in a bunch of files

May 4

Posted by lejon in Computing, perl | Comments off

A quick and dirty one-liner (depending on the length of your lines ;)) to calculate the average of a value in a bunch of files in a directory structure.

The one-liner below picks out a value from each file whose name matches “Logfile*.txt” in the underlying directory structure.

In the below case, the line was in the form of:
Correctly Classified Instances 37 60.6557 %

or
Correctly Classified Instances 37 60 %

The code traverses the directory structure from the current directory, picks out the percentage (here “60.6557”) from each matching file, sums those values, and divides the sum by the number of values found.

find . -name "Logfile*.txt" -exec perl -ne '($var) = (/^Correctly.*\s+((\.|\d)+)\s+%/); print "$var\n" if $var;' '{}' \; | xargs perl -e 'use List::Util qw(sum); print(sum(@ARGV)/scalar(@ARGV)); print "\n";'

Note: Not very robust!! But it IS a one-liner! 😉
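For anyone who prefers something a little less fragile than the one-liner, here is a rough Python sketch of the same computation. It is not a translation of the Perl; the function name is my own, and it assumes the percentage is the last number before the “%” on lines starting with “Correctly”, as in the examples above.

```python
import re
from pathlib import Path

# Matches lines like "Correctly Classified Instances 37 60.6557 %".
LINE_RE = re.compile(r"^Correctly.*\s+([\d.]+)\s+%")

def average_percentage(root):
    """Average the percentages found in all Logfile*.txt files under root."""
    values = []
    for path in Path(root).rglob("Logfile*.txt"):
        with path.open() as handle:
            for line in handle:
                match = LINE_RE.match(line)
                if match:
                    values.append(float(match.group(1)))
    # Avoid dividing by zero when nothing matched.
    if not values:
        return None
    return sum(values) / len(values)

# Example call: average_percentage(".") scans from the current directory.
```

Unlike the one-liner, this returns None instead of failing when no file matches, and it does not skip a legitimate value of 0.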

Tags: perl

Index a DataFrame subset on string column name in Julia

Apr 12

Posted by lejon in Computing, julia | Comments off

julia> using RDatasets


julia> iris = dataset("datasets", "iris")

julia> iris[iris[:Species] .== "setosa", :]
50x5 DataFrame
|-------|-------------|------------|-------------|------------|----------|
| Row # | SepalLength | SepalWidth | PetalLength | PetalWidth | Species  |
| 1     | 5.1         | 3.5        | 1.4         | 0.2        | "setosa" |
| 2     | 4.9         | 3.0        | 1.4         | 0.2        | "setosa" |
| 3     | 4.7         | 3.2        | 1.3         | 0.2        | "setosa" |
| 4     | 4.6         | 3.1        | 1.5         | 0.2        | "setosa" |
| 5     | 5.0         | 3.6        | 1.4         | 0.2        | "setosa" |
| 6     | 5.4         | 3.9        | 1.7         | 0.4        | "setosa" |
| 7     | 4.6         | 3.4        | 1.4         | 0.3        | "setosa" |
| 8     | 5.0         | 3.4        | 1.5         | 0.2        | "setosa" |
| 9     | 4.4         | 2.9        | 1.4         | 0.2        | "setosa" |

Tags: computing, julia, linkedin

Nice post on Julia meta operations

Apr 8

Posted by lejon in Computing, julia | Comments off

Great post about basic Julia stuff @ Julia Helps

Tags: julia

Convert a matrix (Array) of Any to a matrix (Array) of floats in Julia

Apr 8

Posted by lejon in Computing, julia | Comments off

convert(Array{Float64},array_of_Anys)

Tags: julia

Print an array in Julia with four decimals

Apr 5

Posted by lejon in Computing | Comments off

The general form to limit the number of decimals is:

println(round(array, number_of_decimals))

so to get four decimals:

println(round(array, 4))

Tags: julia
