
Monthly Archives: November 2011

HTML5 and CSS3 Readiness

It shows a cool graphical visualization of HTML5 and CSS3 readiness in most of the major browsers we use.

I guess the technology used for the visualization could be of interest to those who work in “data visualization”.

To see the visualization, open this link: http://html5readiness.com/

 

Posted on 27.11.2011 in html5, Visualization

 


Gapminder Desktop

A free tool for animated statistics.

An example of a presentation using Gapminder (“Population growth explained with IKEA boxes”).

 


 

Citation
The act of citing other authors’ work is standard practice among computer scientists. This practice helps us detect and explain community effects. It aids in drawing conclusions that justify progress and in gathering information for funding, as well as in identifying trends and patterns in an evolving field over time. It can serve as a guideline for donors, funding agencies, and tenure committees in making more informed decisions.

http://scientific.thomson.com/free/essays/citationindexing/history/
www.adb.org/knowledgesolutions

Kwame

 


 

Posted on 25.11.2011 in Uncategorized

 

Tableau Public

A free tool for interactive data visualization, with the option to embed visualizations in a website.

 



Document Clustering using Compressive Sampling:

Part of a lecture from the University of Melbourne.

http://videolectures.net/ecmlpkdd2011_park_fast/

 

My topic:

An overview of what I have learned so far.

There are four parts:

1. Hadoop

Hadoop is a basic framework for distributed systems. It handles large amounts of data in a reliable, efficient, and scalable way. It is reliable because it assumes that computing and storage elements will fail, so it maintains multiple copies of the data and ensures that the work of failed nodes is redistributed. It is efficient because it works in parallel, speeding up processing through parallelism. And Hadoop is scalable: a user can develop a distributed application without knowing the low-level distributed details very well, taking full advantage of the power of high-speed computing clusters and storage.
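As a very rough sketch of the fault-tolerance idea (assumed for illustration only, not how Hadoop is actually coded): when a node fails mid-task, the same work is simply re-run on another node that holds a copy of the data. The node names and failure rate here are made up.

import random

# Hypothetical cluster: each of these nodes holds a replica of the same input split.
nodes_with_replica = ["node1", "node2", "node3"]

def run_task(node, split):
    # Simulate a node that may fail partway through its task.
    if random.random() < 0.3:
        raise RuntimeError(f"{node} failed")
    return f"processed {split!r} on {node}"

def run_with_retries(split, nodes):
    # Hadoop-style recovery: on failure, redo the work on another node
    # that holds a replica of the same data.
    for node in nodes:
        try:
            return run_task(node, split)
        except RuntimeError as error:
            print(f"retrying elsewhere: {error}")
    raise RuntimeError("all replicas failed")

print(run_with_retries("input split 0", nodes_with_replica))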

2. MapReduce

MapReduce is a programming model for parallel computation over large amounts of data. It has two core concepts: Map and Reduce. The main idea of MapReduce comes from functional and vector programming languages. It makes it very convenient for programmers who don’t know distributed parallel computing to run their programs on a distributed system.
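To make the model concrete, here is a minimal pure-Python sketch of the word-count example usually used to introduce MapReduce. It simulates the map, shuffle, and reduce phases in a single process; real Hadoop jobs are written against the Hadoop API and run across a cluster.

from collections import defaultdict

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in the input split.
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: combine all counts for one word into a total.
    return (key, sum(values))

documents = ["hadoop stores data", "mapreduce processes data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'hadoop': 1, 'stores': 1, 'data': 2, 'mapreduce': 1, 'processes': 1}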

3. HDFS

HDFS, the Hadoop Distributed File System, is a highly fault-tolerant system designed to be deployed on low-cost hardware. It provides high-throughput access to application data, which suits applications with large data sets. HDFS is an important part of the Hadoop project and was developed as basic infrastructure of the open-source Apache project.
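As a rough illustration of the idea (not HDFS’s actual implementation), the sketch below splits a file into fixed-size blocks and records, NameNode-style, which datanodes hold a replica of each block. The block size, replication factor, and node names here are all invented for the example.

import itertools

BLOCK_SIZE = 8           # bytes per block here; real HDFS blocks are tens of MB
REPLICATION_FACTOR = 3   # number of copies kept of each block
DATANODES = ["node1", "node2", "node3", "node4"]  # hypothetical cluster

def split_into_blocks(data, block_size=BLOCK_SIZE):
    # Split file contents into fixed-size blocks, as an HDFS client does.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes=DATANODES, replicas=REPLICATION_FACTOR):
    # Record replica locations per block: this mapping is the kind of
    # metadata the NameNode keeps (it never stores the file data itself).
    placement = {}
    rotation = itertools.cycle(range(len(nodes)))
    for block_id, _ in enumerate(blocks):
        start = next(rotation)
        placement[block_id] = [nodes[(start + r) % len(nodes)] for r in range(replicas)]
    return placement

blocks = split_into_blocks(b"a small file stored in HDFS-style blocks")
for block_id, replica_nodes in place_replicas(blocks).items():
    print(block_id, blocks[block_id], replica_nodes)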

4. HBase

HBase is a distributed, column-oriented open-source database. The technology comes from the Google paper “Bigtable: A Distributed Storage System for Structured Data” by Chang et al. Just as Bigtable uses the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase is a sub-project of Apache Hadoop. HBase is not like a normal relational database; it is a database for unstructured storage. Another difference is that its data model is column-based rather than row-based.
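A toy illustration of what “column-oriented” means here, assuming nothing about HBase’s real API: rows are addressed by a row key, columns live inside column families, and a row stores only the cells it actually has, so the table can be very sparse.

from collections import defaultdict

class ToyColumnStore:
    """A toy Bigtable/HBase-style table: row key -> (family, qualifier) -> value."""

    def __init__(self, families):
        self.families = set(families)  # column families are fixed at table creation
        self.rows = defaultdict(dict)  # rows are sparse: absent cells cost nothing

    def put(self, row_key, family, qualifier, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][(family, qualifier)] = value

    def get(self, row_key, family, qualifier):
        # Sparse lookup: missing rows or cells simply return None.
        return self.rows.get(row_key, {}).get((family, qualifier))

table = ToyColumnStore(families=["info", "metrics"])
table.put("row1", "info", "title", "Hadoop notes")
table.put("row2", "metrics", "views", 42)      # row2 has no "info" cells at all
print(table.get("row1", "info", "title"))      # Hadoop notes
print(table.get("row2", "info", "title"))      # None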

 

Posted on 21.11.2011 in Uncategorized