Saturday, June 21, 2008

static field shared between instances of generic

The documentation says that a single implementation is shared by all instantiations of a generic class. I guess it makes sense that this translates to one copy of the class (static) variables being shared as well.
public class GenericStatic<T> {
    static String o = "";
    String oo() {
        return o += "o";
    }

    public static void main(String[] args) {
        GenericStatic<Object> a = new GenericStatic<Object>();
        GenericStatic<String> b = new GenericStatic<String>();
        System.out.println(a.oo());
        System.out.println(b.oo());
    }
}
$ javac GenericStatic.java
$ java GenericStatic
o
oo

Wednesday, June 18, 2008

R: keeping a 1-d subset of a matrix a matrix

By default, if a subsetting operation yields a one-dimensional matrix (row or column vector), the dimensions are dropped.
> (m <- matrix(1:4, nrow=2))
     [,1] [,2]
[1,]    1    3
[2,]    2    4
> (v <- m[1, 1:2])
[1] 1 3
This can cause problems when we want to go ahead and continue using the result in matrix operations:
> v %*% m %*% t(v)
Error in v %*% m %*% t(v) : non-conformable arguments


The solution is to explicitly tell R not to drop the dimensions:
> (v <- m[1, 1:2, drop=FALSE])
     [,1] [,2]
[1,]    1    3
> v %*% m %*% t(v)
     [,1]
[1,]   52


Something else to look at: removing a column, given its name in a variable: Chris Handorf, subsetting a dataframe

Friday, June 13, 2008

printf("%02d", n)

irb(main):005:0> printf "%02d\n", 3
03
=> nil
irb(main):006:0> printf "%02d\n", 13
13
=> nil
irb(main):007:0> printf "%02d\n", 113
113
=> nil

Use printf to format numbers, padding one-digit numbers with a leading zero. *
(This is Ruby but I'm marking this as C since that's where I'm most likely to end up using it.)
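
Python's % operator follows the same printf-style conventions, so the zero-padding flag carries over unchanged. A minimal sketch (the helper name pad2 is made up for illustration):

```python
def pad2(n):
    # "%02d": pad with zeros on the left to a minimum width of 2 digits.
    # Wider numbers are printed in full, never truncated.
    return "%02d" % n


for n in (3, 13, 113):
    print(pad2(n))
```

As in the irb session above, the width is only a minimum, so 113 comes out as 113.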

Thursday, June 12, 2008

get flat output in Maxima

I think I've written about this before but a search didn't turn it up. This just concerns getting "flat", all-on-one-line, non-fancy ASCII output from Maxima. (This is especially useful when the output is destined to be converted into C code.)

(%i8) display2d: false$
(%i9) %e^t;

(%o9) %e^t
(%i10) display2d: true$
(%i11) %e^t
;
         t
(%o11) %e

Wednesday, June 11, 2008

Speaking game using python and Mac OS X's say

#!/usr/bin/env python

import os
import random

def say(word):
    os.system("say " + word)

class Score:
    def __init__(self):
        self.total = 0
        self.correct = 0

if __name__ == '__main__':
    pairs = [["bed", "bad"], ["sped", "speed"],
             ["wary", "very"], ["bull", "ball"],
             ["mall", "mole"], ["pale", "pull"],
             ["soup", "stoop"], ["file", "fill"]]
    scores = {}
    for pair in pairs:
        scores["".join(pair)] = Score()
    input = ""

    print("To quit, enter q")
    while input != "q":
        pair = random.choice(pairs)
        word = random.choice(pair)
        print("1) " + pair[0])
        print("2) " + pair[1])
        say(word)
        input = raw_input("What did the computer say?: ")
        input = input.strip()

        if input != "q":
            key = "".join(pair)
            score = scores[key]
            score.total += 1
            if input == word:
                score.correct += 1
                print("Correct!")
            else:
                print("Wrong!")
            print("You\'ve gotten the right answer for "
                  + " vs ".join(pair) + " "
                  + str(score.correct) + " out of "
                  + str(score.total) + " times")
Here's a little game to drill you on somewhat similar sounding pairs of English words. I'm sure much could be done to improve it. Is the use of a class here kosher with Python practice? What's contrary to idiom here?
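
One possible answer to the idiom question, offered as a sketch rather than a verdict: a small class is fine, though it reads more naturally if it owns its own bookkeeping. The version below is written for Python 3 (input instead of raw_input), keys the scores by the pair itself (as a tuple) instead of a joined string, and passes say an argument list through subprocess so the word never goes through a shell. The names record, summary, play_round and the speak/ask/choose parameters are invented here for illustration.

```python
import random
import subprocess


class Score:
    """Running tally for one word pair."""

    def __init__(self):
        self.total = 0
        self.correct = 0

    def record(self, was_correct):
        self.total += 1
        if was_correct:
            self.correct += 1

    def summary(self, pair):
        return "%s: %d out of %d" % (" vs ".join(pair), self.correct, self.total)


def say(word):
    # An argument list avoids handing the word to a shell for parsing.
    subprocess.call(["say", word])


def play_round(pair, scores, speak=say, ask=input, choose=random.choice):
    """Play one round; return False when the player enters q."""
    word = choose(pair)
    print("1) " + pair[0])
    print("2) " + pair[1])
    speak(word)
    answer = ask("What did the computer say?: ").strip()
    if answer == "q":
        return False
    scores[pair].record(answer == word)
    print("Correct!" if answer == word else "Wrong!")
    print(scores[pair].summary(pair))
    return True
```

A driver loop would build scores as a dict from each tuple pair to a Score() and call play_round(random.choice(pairs), scores) until it returns False. Making speak, ask and choose parameters is only there so the round can be exercised without a Mac or a keyboard.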

simulate from AR (and general ARIMA)

> RSiteSearch("generate arima") #~~~~~~~($$$)
> layout(matrix(1:2))
> a1 <- ar(y)
> y1 <- arima.sim(list(ar=a1$ar), 9600)
> plot(y1, type="l")
> a2 <- arima0(y, order=c(10, 0, 5))
> y2 <- arima.sim(list(ar=a2$coef[1:10], ma=a2$coef[11:15]), 9600)
> plot(y2, type="l")

Tuesday, June 10, 2008

printing an array in gdb


(gdb) p *(bestSplit->param->mu)@9
$6 = {0, 0, 0, 0, 0, 0, 0, 0, 0}
(gdb) p *(bestSplit->param->p)@(bestSplit->param->numClusters)
$8 = {0, 0, 0, 0, 0, 0, 0, 0, 0}

Just what I've been dreaming of! The syntax: "*" <array name> "@" <number of elements>. See Debugging with GDB -- Arrays

-DBL_MAX

$ cat neg_dbl_max.c
#include <float.h>
#include <stdio.h>

int main()
{
    printf("%.50g\n", -DBL_MAX);
    return 0;
}
$ gcc -Wall neg_dbl_max.c
$ ./a.out
-1.7976931348623157081452742373170435679807056752584e+308
(This is running under Mac OS X 10.4 on an Intel Mac.) It'd be cool if gcc would recognize this constant and print it instead.
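
For comparison, Python exposes the same limit as sys.float_info.max. A small sketch, assuming the platform's float is an IEEE 754 double (as it is on the Intel Mac above):

```python
import sys

# Most negative finite double; the counterpart of -DBL_MAX in C.
neg_max = -sys.float_info.max

# Same 50-significant-digit format as the C program.
text = "%.50g" % neg_max
print(text)

# Doubling it leaves the finite range and overflows to negative infinity.
print(neg_max * 2)
```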

Monday, June 9, 2008

substitute expressions in Maxima

From the Maxima mailing list: Say you want to substitute one expression for another, also dealing with the case in which the expression to be replaced is multiplied by some constant.
(%i1) e : a^2 + b^2 + c^2;
       2    2    2
(%o1) c  + b  + a
(%i2) ratsubst(r^2, a^2 + b^2 + c^2, e);
       2
(%o2) r
(%i3) e : - 3 * a^2 - 3 * b^2 - 3 * c^2;
           2      2      2
(%o3) - 3 c  - 3 b  - 3 a
(%i4) ratsubst(r^2, a^2 + b^2 + c^2, e);
           2
(%o4) - 3 r

if every vector in a list has the same length, create a matrix

matrixIfAllSameLen <- function(a.list) {
  if (length(unique(unlist(lapply(a.list, length)))) == 1) {
    n <- length(a.list[[1]])
    return(matrix(unlist(a.list), ncol=n, byrow=TRUE))
  }
  else {
    return(a.list)
  }
}

Saturday, June 7, 2008

Getting a quiet NaN in C

nan.c:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    double x = strtod("NAN", 0);
    printf("%f\n", x);
    printf("%d\n", (int)(x == x));
    return 0;
}
$ gcc nan.c
$ ./a.out
nan
0

This is portable, right?

nan1.c:
#include <math.h>
#include <stdio.h>

int main()
{
    printf("%f\n", NAN);
    return 0;
}
$ gcc nan1.c
$ ./a.out
nan

Is this portable??

I'm working under Mac OS X 10.4 here. Let's see if these programs work under Red Hat Linux. Yes and no:
vincent% gcc nan1.c
nan1.c: In function ‘main’:
nan1.c:6: error: ‘NAN’ undeclared (first use in this function)
nan1.c:6: error: (Each undeclared identifier is reported only once
nan1.c:6: error: for each function it appears in.)

So, strtod seems to be one way to go. How about that nice sounding nan function?

vincent% cat nan2.c 
#include <math.h>
#include <stdio.h>

int main()
{
    double x = nan("");
    printf("%f\n", x);
    printf("%d\n", (int)(x == x));
    return 0;
}
vincent% gcc nan2.c -lm
vincent% ./a.out
0.000000
1
(I'm also working here under Linux. I'm not really sure what string I should drop in for the argument for nan. "hi!" also gives the same results as above. I'll stick with strtod for now.)

NaN

Is it safe to define a NaN using 0.0/0.0? Why doesn't the C standard define a function to return a quiet NaN? (Or does it??? I don't think so.) What do some of the open source implementations of the standard library's math functions do? Plan 9 sqrt.c -- ach, where does NaN() come from?

//

Well, well, well: I should've tried apropos nan right away:

man nan

But I'm still a little unclear on the portability of these. Is it safer from a portability perspective to use strtod?

\\


An article suggesting comparing floats in terms of the signed-magnitude numbers corresponding to the bit representations of the floats! Comparing floating point numbers. Cool!
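
To make the bit-representation idea concrete, here is a rough Python sketch of how I understand it, assuming IEEE 754 doubles: reinterpret each double as a 64-bit integer, turn the sign-magnitude pattern into an ordinary signed integer, and measure how many representable steps separate two values. The names ordered_bits, ulps_apart and nearly_equal, and the default tolerance of 4, are made up for illustration. NaNs and infinities are not given any special treatment here.

```python
import struct

SIGN_BIT = 1 << 63
MAGNITUDE_MASK = SIGN_BIT - 1


def ordered_bits(x):
    """Map a double to an integer that sorts in the same order as the float."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    # Doubles are stored sign-magnitude: negate the magnitude for negatives.
    if bits & SIGN_BIT:
        return -(bits & MAGNITUDE_MASK)
    return bits


def ulps_apart(a, b):
    """How many representable steps separate a and b (NaNs not handled)."""
    return abs(ordered_bits(a) - ordered_bits(b))


def nearly_equal(a, b, max_ulps=4):
    # max_ulps is an arbitrary illustrative tolerance, not a recommendation.
    return ulps_apart(a, b) <= max_ulps


print(ulps_apart(0.1 + 0.2, 0.3))
print(nearly_equal(0.1 + 0.2, 0.3))
print(nearly_equal(1.0, 1.001))
```

With this mapping +0.0 and -0.0 both land on 0, so they compare as zero steps apart.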

proportions p such that sum(round(100 * p)) != 100

> for (i in 1:10000) { u <- runif(3); p <- u / sum(u); k <- round(100 * p); if (sum(k) != 100) { print(p); print(k); break} }
[1] 0.03470525 0.45335042 0.51194434
[1] 3 45 51
> c(0.03470525, 0.45335042, 0.51194434)
[1] 0.03470525 0.45335042 0.51194434
> p <- c(0.03470525, 0.45335042, 0.51194434)
> sum(round(100 * p))
[1] 99


> p <- c(0.035, 0.45, 0.515)
> sum(round(100 * p))
[1] 101
> sum(p)
[1] 1
> 1 - sum(p)
[1] 0

Friday, June 6, 2008

partition number according to proportions

#include <assert.h>
#include <math.h>

/**
 * On exit,
 * counts[k] = number of points assigned to kth class,
 * k = 0:(numClasses - 1) out of numPoints. Ensures that counts[k] > 0
 * for every k while also sum(counts) = numPoints.
 */
void proportionsToCounts(const double* proportions,
                         int numClasses,
                         int numPoints,
                         int* counts)
{
    int total = 0;
    int numEmpty = 0;
    int i;

    assert(numPoints >= numClasses);

    for (i = 0; i < numClasses; ++i) {
        counts[i] = (int) round(numPoints * proportions[i]);
        total += counts[i];
        if (!counts[i]) {
            ++numEmpty;
        }
    }

    if (numEmpty > 0 || total != numPoints) {
        const double remaining = numPoints - numClasses;
        int assigned = 0;
        for (i = 1; i < numClasses; ++i) {
            const int extra = (int) (remaining * proportions[i]);
            assigned += extra;
            counts[i] = 1 + extra;
        }

        counts[0] = 1 + numPoints - numClasses - assigned;
    }
}
Earlier I had implemented this using only the second loop (now used only conditionally), with the difference that I first rounded remaining * proportions[i] before converting it to an int. However, this sometimes allocated too many points, leading to a negative remainder and an invalid value for counts[0]. Switching to simply truncating toward zero, however, meant that we sometimes did not get exact results even when they were available. E.g., since (int) (97 * 0.5) = 48, the code under the if statement above divides 100 into (26, 49, 25) given proportions (0.25, 0.5, 0.25) rather than (25, 50, 25) as one would hope. Hence, I have added the first loop and relegated the "safer" code to be called only in case of emergencies.

I'm sure there's a better way to do this, but the above is probably more than good enough for my purposes.
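
One candidate for a better way, sketched in Python rather than C: hand every class one point up front, split the rest by truncation, and then give the leftover points to the classes with the largest fractional parts (the "largest remainder" idea). This is a different technique from the C code above, not a translation of it, and it assumes the proportions are non-negative and sum to roughly 1. The function name proportions_to_counts is made up for illustration.

```python
import math


def proportions_to_counts(proportions, num_points):
    """Split num_points into positive integer counts that sum to num_points."""
    num_classes = len(proportions)
    assert num_points >= num_classes

    # Every class gets one point; only the rest is split by proportion.
    remaining = num_points - num_classes
    raw = [remaining * p for p in proportions]
    extras = [math.floor(r) for r in raw]

    # Truncation can only lose points, so the shortfall is never negative
    # (assuming the proportions sum to about 1).
    shortfall = remaining - sum(extras)

    # Hand the lost points to the classes with the largest fractional parts.
    by_fraction = sorted(range(num_classes),
                         key=lambda i: raw[i] - extras[i],
                         reverse=True)
    for i in by_fraction[:shortfall]:
        extras[i] += 1

    return [1 + e for e in extras]


print(proportions_to_counts([0.25, 0.5, 0.25], 100))
```

For the example in the text, 97 points remain after the initial one per class, truncation gives (24, 48, 24), and the single leftover point goes to the middle class, so the result is [25, 50, 25].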

Wednesday, June 4, 2008

Working with dates

Our cat is supposed to receive medicine every day for 60 days and then every other day, starting April 7. Working in Python:
>>> import datetime
>>> start = datetime.date(2008, 4, 7)
>>> d = start + datetime.timedelta(days=60)
>>> d
datetime.date(2008, 6, 6)
>>> d.ctime()
'Fri Jun  6 00:00:00 2008'
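
The rest of the schedule can be generated the same way. A small sketch, under the assumption (mine, not the vet's) that the 60 daily doses run from the start date inclusive and that the first every-other-day dose falls on day 60, i.e. the June 6 computed above. The generator name dose_dates is made up.

```python
import datetime
import itertools


def dose_dates(start, daily_days=60):
    """Yield dose dates: daily for daily_days days, then every other day."""
    one_day = datetime.timedelta(days=1)
    day = start
    for _ in range(daily_days):
        yield day
        day += one_day
    # Assumption: the alternating phase begins on the day after the last
    # daily dose and never ends (the caller decides when to stop).
    while True:
        yield day
        day += 2 * one_day


start = datetime.date(2008, 4, 7)
# Look at the transition from daily to every-other-day dosing.
for d in itertools.islice(dose_dates(start), 58, 64):
    print(d.isoformat())
```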