Saturday, December 1, 2018

Cassandra - Data modeling



Cassandra is a NoSQL database. Some of the core features it provides are -

  1. High write performance
  2. High scalability
  3. Fault tolerance
  4. Performance that scales linearly as nodes are added
  5. Easy data distribution

The Cassandra data model contains -
  1. Keyspaces - A keyspace is the container of all data in Cassandra. Replication is specified at the keyspace level.
  2. Tables - Every table must have a primary key, which can be a composite primary key.
  3. Columns - A column holds a name and value pair within a row.


Cassandra provides CREATE TABLE, ALTER TABLE, and DROP TABLE statements for data definition, and INSERT, UPDATE, SELECT, and DELETE statements for data manipulation. WHERE clauses in Cassandra are restricted: filters on columns that are neither part of the primary key nor indexed can result in cluster-wide scans, which are not desirable.

While doing data modeling for your application, keep the following in mind -
  1. Start with what kind of queries you will be performing on the database.
  2. Denormalize when you can.
  3. Data should be spread evenly across the cluster. This can be achieved by picking the right partition key. The partition key is the first element of the primary key (and may itself be made up of several columns). Rows are distributed across nodes by the partition key and sorted within each partition by the remaining clustering columns.
  4. Minimize the number of partitions that need to be accessed in your read queries.
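
To make points 3 and 4 concrete, here is a minimal in-memory sketch of how a primary key splits into a partition part and a clustering part. It is a toy model written for this post, not Cassandra's API: the class name, the sensor table, and the plain hash used to pick a node are all assumptions (a real cluster assigns partitions through a token ring).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of partitioning and clustering; not Cassandra's API. */
class PartitionSketch {
    // partition key -> (clustering key -> value); TreeMap keeps rows sorted inside a partition
    private final Map<String, TreeMap<String, String>> partitions = new HashMap<>();

    /** Stores a row under the primary key (partitionKey, clusteringKey). */
    public void insert(String partitionKey, String clusteringKey, String value) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>())
                  .put(clusteringKey, value);
    }

    /** Reads one whole partition with a single lookup; rows come back in clustering order. */
    public TreeMap<String, String> readPartition(String partitionKey) {
        return partitions.getOrDefault(partitionKey, new TreeMap<>());
    }

    /** Picks a node for a partition key. Assumption: a plain hash stands in for the token ring. */
    public static int nodeFor(String partitionKey, int nodeCount) {
        return Math.floorMod(partitionKey.hashCode(), nodeCount);
    }

    public static void main(String[] args) {
        PartitionSketch table = new PartitionSketch();
        // Hypothetical table keyed by (sensor_id, reading_time)
        table.insert("sensor-1", "2018-12-01T10:00", "21.5");
        table.insert("sensor-1", "2018-12-01T09:00", "20.9");
        table.insert("sensor-2", "2018-12-01T09:30", "18.2");

        // All readings of one sensor live in one partition, sorted by time
        System.out.println(table.readPartition("sensor-1"));
        System.out.println("sensor-1 -> node " + nodeFor("sensor-1", 3));
    }
}
```

A query that names the partition key touches exactly one entry of the outer map, which is the in-memory analogue of reading a single partition. A query that filters only on the value would have to walk every partition.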






Friday, August 8, 2014

Solr 4.9.0 Tomcat SEVERE: Error filterStart

Error

SEVERE: Error filterStart

Solution:

For Solr version 4.9.0

These jars are required in the tomcat/7.0.23/lib/ directory-

slf4j-api-1.7.6.jar
slf4j-log4j12-1.7.6.jar

Thursday, February 14, 2013

Java - String comparison

Does it give you a bellyache?
Never mind, you can compare them like this-

System.out.println(gg1.compareTo(gg3));
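
For a fuller picture, here is a small self-contained sketch. The variable names follow the line above, but the string values and the helper method are made up for illustration.

```java
/** Compares strings by content. The values below are made up for illustration. */
class StringCompareDemo {
    /** Returns "before", "same" or "after" depending on how a sorts relative to b. */
    static String order(String a, String b) {
        int c = a.compareTo(b);
        return c < 0 ? "before" : (c == 0 ? "same" : "after");
    }

    public static void main(String[] args) {
        String gg1 = "solr";
        String gg2 = new String("solr"); // equal content, different object
        String gg3 = "tomcat";

        System.out.println(gg1 == gg2);                   // false: == compares references
        System.out.println(gg1.equals(gg2));              // true: equals compares content
        System.out.println(gg1.compareTo(gg3) < 0);       // true: "solr" sorts before "tomcat"
        System.out.println(gg1.equalsIgnoreCase("SOLR")); // true
        System.out.println(order(gg3, gg1));              // after
    }
}
```

Use equals() when you only need to know whether two strings match, and compareTo() when you need an ordering.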

Saturday, February 9, 2013

Url encode an HTTP GET Solr request and parse json using gson java library

Make a request to the Solr web server and parse the JSON response into Java classes using the code snippets below.
The JSON to be parsed-
The corresponding Java classes-
Code snippet to make a web server call and parse using gson-

The Java classes must have the same variable names as the JSON keys. Notice how nested JSON elements are handled.
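
The URL-encoding half of this can be sketched with the standard library alone. The host, core name, and query below are placeholders rather than values from a real setup, and the sketch stops at building the URL: fetching the response and handing it to gson is left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SolrUrlSketch {
    /** Builds a select URL with the query URL-encoded. baseUrl is a placeholder for your Solr core. */
    static String buildSelectUrl(String baseUrl, String query) {
        try {
            // Encode only the parameter value, never the whole URL
            String encoded = URLEncoder.encode(query, "UTF-8");
            return baseUrl + "/select?q=" + encoded + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical core URL and query
        String url = buildSelectUrl("http://localhost:8983/solr/collection1",
                "title:\"data modeling\" AND year:2013");
        System.out.println(url);
    }
}
```

Without the encoding step, the spaces, quotes, and colons in the query would produce an invalid GET request.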

Wednesday, February 6, 2013

Errors running builder 'Google WebApp Project Validator'

Error:

Problems occurred building the selected resources.
Errors running builder 'Java Builder' on project 'curve_app'.
com/google/gdt/eclipse/suite/preferences/GdtPreferences
Errors running builder 'Google WebApp Project Validator' on project 'curve_app'.
java.lang.NullPointerException

Workaround solution:

Close all projects and close Eclipse.

Start Eclipse with the "eclipse -clean" option.

If it does not work the first time, try repeating the process 2-3 times.

I am not sure of the exact reason why this error appears.


Tuesday, February 5, 2013

Highcharts Phantomjs Export - TypeError: 'undefined' is not an object

Generating a png file from config-

phantomjs-1.8.1/bin/phantomjs highcharts-convert.js -infile config.json  -outfile out1.png -width 300 -scale 2.5 -constr Chart -callback callback.js

Error-

TypeError: 'undefined' is not an object

Things to check-
1. The following 3 files should be in the same folder as highcharts-convert.js, or you have to set their paths here-
var config = {
                /* define locations of mandatory javascript files */
                HIGHCHARTS: 'highstock.js',
                HIGHCHARTS_MORE: 'highcharts-more.js',
                JQUERY: 'jquery-1.8.2.min.js'
        },

2. The 'infile' parameter should have a .json extension, because the script uses the extension to decide between SVG input and a JSON configuration.

Friday, February 1, 2013

Monday, January 21, 2013

Hadoop - OutOfMemoryError: Java heap space

I was writing a Hadoop job that processes many files and creates multiple output files from each input file. I was using "MultipleOutputs" to write them. It worked fine for a small number of files, but I was getting the following error for a large number of files. I tried increasing the ulimit and -Xmx, but to no avail.

2013-01-15 13:44:05,154 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hdfs.DFSOutputStream$Packet.<init>(DFSOutputStream.java:201)
    at org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1423)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:161)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:136)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:125)
    at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:116)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:90)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
    at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:99)
    **at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:386)
    at com.demoapp.collector.MPReducer.reduce(MPReducer.java:298)
    at com.demoapp.collector.MPReducer.reduce(MPReducer.java:28)**
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:164)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:595)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:433)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)


Solution:
I used the following configuration values to resolve it-
OPTS="-Dmapred.reduce.tasks=8 -Dio.sort.mb=640 -Dmapred.task.timeout=1200000"
hadoop jar ${JAR} ${OPTS} -src ${SRC} -dest ${DST}









Tuesday, September 25, 2012

iOS 6 - Apple maps needs to find a shorter route to the correct destination

We have all heard a lot about Apple Maps lately. The verdict is already out: a forgettable job by Apple. The last full update (iPhone 4) had the dropped-signal issue, and now it is the maps. I usually don't like to get involved in the Apple/Android war, but this was one incident I couldn't resist writing about.
Yes, I have an iPhone. It is an iPhone 4 and I recently updated to iOS 6. I am on AT&T and want to move to T-Mobile. I am eligible for an upgrade, but I will skip the upgrade this time for reasons I would like to keep out of this blog entry. So on a Sunday morning, when I finished buying groceries at the nearby Lucky's, the first thing I did was search for a T-Mobile store on my new Apple Maps, and this is what it showed-



So I started driving, and at my destination I found another Lucky's supermarket and no T-Mobile store anywhere close. That was such a shame.

So I tried searching for the same route in Google Maps, and this is what I got-

Interestingly, the place is in the opposite direction altogether.
Then I tried to move the destination pin to the location where Apple maps was locating the T-mobile store and this is what I get-


So it looks like the problem is not only with locating the address but also with the route taken. Google Maps takes me via a simpler route, a U-turn and a right turn, but Apple Maps goes through a series of twists and turns: getting on the highway, changing lanes, etc. The route by Google Maps is also shorter, at 3.2 miles (to the wrong destination, of course).

I am not an expert, but I guess Apple needs to map the correct coordinates to the expected destination and also debug its shortest-route algorithms. It will be a while before I trust Apple Maps. Meanwhile, I will use my secondary phone, a Nokia N8, for my navigation needs.

On a side note, Nokia 920 does sound appealing to me.




Thursday, September 6, 2012

GWT - Cookie provided by RPC doesn't match request cookie

Even after setting the security cookie on client and server-
bindConstant().annotatedWith(SecurityCookie.class).to("mycookie");

I was getting the following exception-

Sep 6, 2012 4:09:40 PM com.gwtplatform.dispatch.server.AbstractDispatchServiceImpl cookieMatch
INFO: No cookie sent by client in RPC. (Did you forget to bind the security cookie client-side? Or it could be an attack.)
Sep 6, 2012 4:09:40 PM com.gwtplatform.dispatch.server.AbstractDispatchServiceImpl execute
SEVERE: Cookie provided by RPC doesn't match request cookie, aborting action, possible XSRF attack. (Maybe you forgot to set the security cookie?) While executing action: com.company.appmodule.client.Login

Solution: Clear cache and cookies from browser.



Sunday, August 26, 2012

GWT Celltable - Add multiple anchors to a cell



I am trying to insert multiple anchors in a single CellTable cell. This is what it would look like-
The cars column has multiple cars, each car link can be clicked to perform different actions.
I have extended a MultipleAnchorCell class from AbstractSafeHtmlCell like this-



And then I use this cell in a celltable to create a column. My User class contains comma separated values of cars in a class variable.

      

       

I am still learning GWT and it took me a while to figure this out.

Comments/feedback is welcome.
===============================================================

Saturday, April 7, 2012

GWT dygraph example using visualization datatable

Here is a quick example of how to create a dygraph using GWT and visualization api to get the datatable.

1. Add dygraph-gwt.jar to your build path and these inherits tags in your mysamplegwtproj.gwt.xml file

<inherits name="com.google.gwt.visualization.Visualization"/>
<inherits name="org.danvk.dygraphs"/>

2. Add the following script tags in mysamplegwtproject.html file

<script type="text/javascript" src="dygraph-combined.js"></script>
<script type="text/javascript" src="http://www.google.com/jsapi"></script>

3. Add the following code in your onModuleLoad() function-



        final VerticalPanel vp = new VerticalPanel();
        vp.setWidth("700px");
        vp.setHeight("700px");
        VisualizationUtils.loadVisualizationApi(new Runnable() {
            @Override
            public void run() {
                Query q = Query
                        .create("mysampleproj/getMyDatatable?params=2&id=1");

                q.send(new Callback() {
                    @Override
                    public void onResponse(QueryResponse response) {
                        JavaScriptObject jdygraph = createDygraph(
                                vp.getElement(), response.getDataTable(), 0,
                                100);
                    }
                });
            }
        }, LineChart.PACKAGE);

        RootPanel.get().add(vp);

4. Here is the JSNI function which actually creates the dygraph.


public static native JavaScriptObject createDygraph(Element element,
            DataTable dataTable, double minY, double maxY) /*-{
        var chart = new $wnd.Dygraph.GVizChart(element);
        chart.draw(dataTable, {
            valueRange : [ minY, maxY ],
            displayAnnotations : true,
            showRangeSelector : true,
            legend : 'always',
            labelsDivStyles : {
                'textAlign' : 'right'
            },
            title : 'TITLE',
            titleHeight : 25,            
            axes : {
                x : {
                    pixelsPerLabel : 50
                }
            }            
        });
        return chart;
    }-*/;

Hope this helps. Suggestions/improvements are welcome.

===============================================================
 

Wednesday, April 4, 2012

NoClassDefFoundError: com.google.common.collect.Sets


Error-
NoClassDefFoundError: "com.google.common.collect.Sets"

Solution-
Add guava-11.0.2.jar to WEB-INF/lib and add it to the build path.

Wednesday, March 21, 2012

GWT wrapper for visualization treemap with mouse events

A treemap is a helpful visualization of a data tree. It would be nice if the treemap had features like weighted box sizes for intermediate levels and explicit colors for non-leaf nodes. I wrote a quick Google Web Toolkit (GWT) wrapper for the Google visualization treemap-

public class TreeMap extends Visualization<TreeMap.Options>
{
   public static class Options extends AbstractDrawOptions {
       public static Options create() {
           return JavaScriptObject.createObject().cast();
       }

       protected Options() { }
   }

   public static final String PACKAGE = "treemap";

   public TreeMap() {
       super();
   }

   public TreeMap(AbstractDataTable data, Options options) {
       super(data, options);
   }

   @Override
   protected native JavaScriptObject createJso(Element parent) /*-{
       return new $wnd.google.visualization.TreeMap(parent);                
   }-*/;
   
   public final void addOnMouseOverHandler(OnMouseOverHandler handler) {
       Handler.addHandler(this, "onmouseover", handler);
   }
   
   public final void addOnMouseOutHandler(OnMouseOutHandler handler) {
       Handler.addHandler(this, "onmouseout", handler);
   }
}

Tuesday, March 20, 2012

Postgres script - for loop two dimension array using array upper and lower


A basic PL/pgSQL script to loop over a two-dimensional array, using array_lower and array_upper for the loop bounds-


CREATE OR REPLACE FUNCTION func1(n character varying, v character varying)
  RETURNS integer AS
$BODY$
    DECLARE
       return_code integer;
    BEGIN
       RAISE NOTICE '(%,%)', n, v;
       return_code := 1;
       RETURN return_code;
    EXCEPTION
        WHEN NO_DATA_FOUND THEN
           RETURN -1;
    END;
$BODY$
  LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION func2()
  RETURNS integer AS
$BODY$
    DECLARE
       return_code integer;
       pairs varchar[][] := array[['key2','val2'],
                                  ['key1','val2']];
    BEGIN
        FOR i IN array_lower(pairs, 1) .. array_upper(pairs, 1)
        LOOP
           --RAISE NOTICE '%,%',pairs[i][1]::varchar, pairs[i][2]::varchar;
           PERFORM func1(pairs[i][1]::varchar, pairs[i][2]::varchar);
        END LOOP;
        RETURN 1;
    END;
$BODY$
  LANGUAGE plpgsql;

SELECT * from func2();

Saturday, October 15, 2011

Problem - GWT logging not working

When you inherit certain modules in your .gwt.xml file, GWT logging gets disabled. This happened to me when I tried inheriting RequestFactory. Some modules inherit com.google.gwt.logging.LoggingDisabled internally and thus affect your logging.

To fix this, explicitly enable logging after you have inherited all your modules. At the end of your list of inherits, add the following to your .gwt.xml file-

<set-property name="gwt.logging.enabled" value="TRUE"/>

Executing Postgres crosstab query as a prepared statement


I was executing a crosstab query as a prepared statement in Java (in a GWT app) and getting the following error -

PSQLException - Can't use query methods that take a query string on a PreparedStatement.

With help from some folks on Stack Overflow, I was able to resolve the error with the following code -

// A Java string literal cannot span lines, so the SQL is concatenated
String query = "SELECT * FROM crosstab("
        + "'SELECT rowid, a_name, value "
        + "FROM test WHERE a_name = ''att2'' "
        + "OR a_name = ''att3'' "
        + "ORDER BY 1,2'"
        + ") AS ct(row_name text, cat_1 text, cat_2 text, cat_3 text)";

PreparedStatement stat = conn.prepareStatement(query);

// Note that it is executeQuery() and not executeQuery(query)
ResultSet rs = stat.executeQuery();
while (rs.next()) {
    //TODO
}





Thanks!

PostgreSQL crosstab query - Rotate a table about a pivot

An interesting feature of relational databases (Postgres in this case) is the ability to rotate a table about a pivot. So if you have data like this-
 id | rowid | key  | value
----+-------+------+-------
  1 | test1 | key1 | val1
  2 | test1 | key2 | val2
  3 | test1 | key3 | val3
  4 | test1 | key4 | val4
  5 | test2 | key1 | val5
  6 | test2 | key2 | val6
  7 | test2 | key3 | val7
  8 | test2 | key4 | val8

And you want a result set like this -

 rowid | key1 | key2 | key3 | key4
-------+------+------+------+------
 test1 | val1 | val2 | val3 | val4
 test2 | val5 | val6 | val7 | val8
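
To show what the rotation does, the mapping from the first table to the second can be sketched in plain Java. This is an in-memory illustration only, not the crosstab query: the class and method names are made up, and each row is assumed to be a (rowid, key, value) triple.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PivotSketch {
    /** Turns (rowid, key, value) triples into one key -> value map per rowid. */
    static Map<String, Map<String, String>> pivot(List<String[]> rows) {
        // LinkedHashMap keeps rowids and keys in the order they first appear
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (String[] row : rows) {
            // row[0] = rowid, row[1] = key (becomes a column), row[2] = value
            result.computeIfAbsent(row[0], k -> new LinkedHashMap<>()).put(row[1], row[2]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"test1", "key1", "val1"},
                new String[] {"test1", "key2", "val2"},
                new String[] {"test2", "key1", "val5"},
                new String[] {"test2", "key2", "val6"});
        // Prints {test1={key1=val1, key2=val2}, test2={key1=val5, key2=val6}}
        System.out.println(pivot(rows));
    }
}
```

Each distinct rowid becomes one output row, and each distinct key becomes a column of that row.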


It can be achieved by a "crosstab" query in a postgres database -



Update: 02/14/2013
The same result can be achieved by a crosstab and also by an interesting SQL query which I came across here - http://stackoverflow.com/questions/14863985/postgres-crosstab-maybe


Thursday, September 15, 2011

Virtualbox-Windows 8 developer preview installation error

I downloaded the Windows 8 Developer Preview yesterday, only to find I could not install it on Oracle VirtualBox. I tried both the 32-bit and 64-bit versions on a Windows 7 64-bit machine (HP EliteBook 8440p). The 32-bit version got stuck at the following screen. I need to look for help on this error: 0x81B8C63B


EDIT: Following "Smiley"'s comment below, I turned the Virtualization option ON in the BIOS menu at startup and was then able to install Windows 8. Thanks.
Surprisingly, I have an Ubuntu virtual machine that runs fine without the Virtualization option turned on.

Thursday, March 17, 2011

Wxpython snippet - A date time control



Make sure you have installed wxPython (the Python bindings for wxWidgets).



import wx
import wx.lib.masked.timectrl
from datetime import datetime

# Create date control
self.dateCtrl = wx.DatePickerCtrl(panel, -1, pos=(130, 70))

# Create time control, pre-filled with the current time
self.timeCtrl = wx.lib.masked.timectrl.TimeCtrl(panel,display_seconds=False,
                                               fmt24hr=False, id=-1, name='timeCtrl',
                                               style=0,useFixedWidthFont=True,
                                               value=datetime.now().strftime('%X'), pos = (250,70))

To get values -

self.timeCtrl.GetValue()

self.dateCtrl.GetValue()