Thursday, February 28, 2008

Tag cloud search

Quintura has developed an interesting search engine: tag cloud search. You won't need to type much when you are searching for something; you just have to click.

It comes as a widget. So you can easily add it to your web site too.


Updated on 3/19/2008:
There are numerous complaints about Facebookster's business. There has been no official word from the company denying those allegations, but also no reliable source confirming that this is a "spammer" or "scammer" business. Until the situation is clarified, I suggest everyone stay away from the company.

See my comments post.

Facebookster is a Facebook application development company. Its business objective is to "make money with Facebook". The company has built a handful of FB applications.

According to its Web site, the company employs 100+ developers and designers and provides Facebook and social network related consulting to Fortune 100 clients.

The company published an essay on "How to make money with Facebook", in which it describes seven ways to achieve this objective:
  1. Selling ads
  2. Sponsorship
  3. Sell goods and services
  4. Write a Facebook book
  5. Write a Facebook blog
  6. Develop Facebook apps as a consultant
  7. Sell your Facebook app
I wonder: is FB now officially a Web platform that holds the power to create and destroy Web ecosystems, nurturing new startups and disrupting existing business models?

Can you imagine what it would be like if FB went down for 24 hours? Would it have an impact similar to Google or Yahoo! going down for 24 hours? Which would be worse?

Wednesday, February 27, 2008

Social Networks for Conferences

One very important reason people go to conferences is to meet people working in their field and establish connections. In this web 2.0 age it is no wonder that a social network creation tool like CrowdVine would provide ways to tailor its creations to conferences.

Google Social API

Google's API for social networks was released a little while back, and Digital Web currently has an article introducing developers to the API.

It's a brief introduction, but it covers some initial points you should know. Digital Web usually doesn't run in-depth articles or tutorials.

Sunday, February 24, 2008

The Semantic Web is expected

I was helping a friend set up the Windows Vista Parental Controls this weekend. They wanted to keep the kids away from the "naughty" Internet. They had set up Content Manager at first, but it was out of control in its cluelessness and uselessness. They didn't grok the fact that a filtering system can't just understand what is on a page and block things nice and easy. The Vista Parental Control filter worked out a lot better than Content Manager. Today's filters can block everything and allow some sites, or allow everything and block some sites. Problems always result from these systems blocking legitimate websites. The promise of the semantic web is that filters could actually function in the future, by giving actual meaning to web pages instead of random words interpreted by filtering systems, often incorrectly.

The Personal Semantic Web

TheBrain has been in development since 1996. I saw an earlier version a few years ago but never got around to setting it up. The system allows one to organize ideas, thoughts, files, and other items, linking them all together for easy management. Instead of using folders to create cascading containers of information, the system is more akin to a set of symbolic links, allowing you to create links between different data points in a more natural way. You can assign real-world meaning to your data and get at it in a more useful way. This all of course requires management on the user's part, but the tools available with the system seem powerful enough to scale nicely.

List of TheBrain common uses: TheBrain

Taglocity Transitions Tagging To The Enterprise

In researching some of the inroads tagging has made in the enterprise setting, I came across Taglocity, a plugin for Microsoft Outlook which brings tagging and filtering to the popular email client out of Redmond. Check it out if you're interested.
Just to highlight some of the interesting features:
- Auto-Tagging - If this feature works as well as advertised, that's a huge breakthrough in tagging. It makes tagging much less like folders, and much more of a new paradigm.
- Tags replacing folders - And while we're at it, it's tags replacing folders! I don't really want to change the structure of how my messages are stored. And sometimes I just want a flat view of everything. Folders don't really get you there; tagging is a much more flexible solution.
- Tag Sharing - So now you can also send a tagged message off to someone else. I have mixed feelings about this. On one hand, this can be an awesome way to make sure spam doesn't get into your "Important Business Meetings" collection of emails. On the other hand, I'm sure five minutes from now some spammer is going to figure this out, and I'd wake up to find that a certain "enhancement" product was a sure hit for my "Family Engagements" category.
For a while I felt like "Oh, tagging, great, we've got a new way to say 'folders' now, awesome!" But this is one of many products coming out that redefines the ways we're going to use tagging, and I think tagging is starting to break out as a really useful tech paradigm.
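How auto-tagging might work in principle can be sketched with naive keyword matching. This is not Taglocity's actual algorithm, and the tag rules below are invented for the example:

```python
# Naive auto-tagger: assign every tag whose keywords appear in the message.
TAG_RULES = {
    "meetings": {"meeting", "agenda", "minutes"},
    "travel": {"flight", "hotel", "itinerary"},
}

def auto_tag(text):
    words = set(text.lower().split())
    return {tag for tag, keywords in TAG_RULES.items() if words & keywords}
```

Note that a message can pick up several tags at once, which is exactly what makes tags more flexible than a single folder assignment.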

Tagging online ads for web analytics

Most of the major web analytics tools, including Google Analytics, Omniture, and WebSideStory, use tagging. Avinash Kaushik writes more about web analytics. Tagging helps such analytics applications track different aspects of online marketing activity. Link tagging involves adding query string parameters to the destination URLs used in online ads. Google Analytics is integrated with Google AdWords, which provides an option for auto-tagging, and to tag other paid search URLs Google provides a URL builder tool. Tagging search campaigns has become critical to ensure accuracy.

Searching/Tagging in Enterprises

There is a 2006 ACM Queue article [1] that discusses the evolving needs of searching enterprise data sets. An obvious problem that continues to drive the need for search is the growth rate of data produced by enterprises. The article talks about building metadata catalogs and using them to add search capabilities to enterprise applications. The authors then discuss providing a platform that allows business users to define and create the catalogs. They mention tagging, though only briefly, and describe how it could be leveraged to allow users to dynamically add to or update the catalogs without requiring a catalog "administrator".

1: R. Barrows and J. Traverso. Search Considered Integral. ACM Queue, 4(4):30-36, May 2006.

Tag Hate

Someone made a series of blog posts about his dislike of tag usage. Among his reasons, he talks about having to know the exact phrase in order to get results. He also calls attention to the different ways different people like to organize data. It's an old post, but do you think his issues with tags are still valid today? Or are his claims simply an expression of his personal preference against tagging?

Friday, February 22, 2008

visualizing tags

We've been talking a lot about tagging recently. Even if you are a big proponent of tags, they won't do you any good if they aren't displayed so that you can explore them.

First, there is the ubiquitous tagcloud.

But tagclouds aren't the only way to explore tags. Some other examples:

Delicious Soup

Fidgt - uses "Tag Magnets" to find other users with the same tags.

Mind My Map

Images of Mind My Map and Delicious Soup are from Visual Complexity

Notorious: Enterprise social bookmarking

Notorious is an application for enterprise social bookmarking. Employees can share their personal bookmarks with others on a team or project. Notorious is designed for use specifically within a company. You can tag any item on the internet and intranet. It also allows you to find the most popular topics that are being bookmarked. I think the data generated by this system could be used to monitor news and gossip, find the most popular intranet pages, and connect with people who share your interests.

The Future of Tagging

Search techniques will eventually replace many uses of tagging. Tagging documents by their topics will be replaced by IR search. Tagging pictures might be replaced by imaging technologies and techniques such as face detection. So with text and images covered by other developing technologies, where will the future of tagging be?

Human tagging will be needed for problems that computers cannot easily process. I see this as primarily human evaluation. Humans can make value judgments (Good/Evil, Happy/Sad, Positive/Negative) that machines are incapable of making without human guidance. Personally, I see tagging being applied by people to people. I can envision people tagging others' blog pages with their own value judgments as the primary form of tagging.

The Good, the Bad, and the Hammer.

Last class we had our debate on whether or not social networks are good for society. We had three categories: good for society, bad for society and undecided. However, I feel that we needed another category: Neither good nor bad for society.

Social networks are a technology, just like the hammer is. Is the hammer good or bad for society? Hammers can be used to build houses or to smash in windows... not to mention all the cheesy murder mysteries on the tube involving a bloody hammer.

Obviously the hammer is a much simpler technology than a social network is but I am of the opinion that both technologies are of the same fundamental nature; as technologies, they are both tools used to carry out a task. As such, both hammers and social networks can be good or they can be bad.

In class our debate was less about the technology of a social network than about what social networks are currently used for, and at one point the 'Wild West' analogy was made but we did not really run with it. Where I work we have a social website that is only accessible from inside the building. We use it to escape from Lotus Notes email (forums, hurray!), schedule conference rooms, leave work-related messages, and figure out who has the skills necessary for a specific task. The issue of privacy is completely removed from the equation. This network speeds up workplace efficiency by spreading important news effectively, and it keeps itself from being a time sink since it is used only for work-related information. If the privacy issues of Facebook and MySpace equate to an application of technology that is bad for society, then surely the application of the technology at my work location is an example of a good application. Our debate in class also followed a similar approach: an example of a good application, like organizing people for a rally, would be given and then countered with an example of a bad application, like an internet stalker.

If the current status of social networks on the web is equivalent to the rough, lawless times of the Wild West then the enterprise level, closed social network where I work can be likened to the Sheriff rolling into town and law and order taking a foothold. Just like the Wild West was developed and settled, so will social web technologies. But, like any tool or technology, there will always be good and bad applications depending upon how people choose to use them.

Social network for corporate travelers

Sabre has announced a social networking site that enables travel advice and recommendations to be shared among colleagues.

It gives us an idea about how Web 2.0 could be used to share information with others. It has enabled us to share information that was rarely available otherwise. You can rely on information that you get from people you know rather than relying on some other source.

Wednesday, February 20, 2008

Microsoft wants YOU...

... to code in their languages. In an effort to gain some market share from the amateur developer world, Microsoft is courting students with a free offering of many major Microsoft development tools, including Visual Studio 2008, Expression Studio, and XNA Game Studio. With tools like Eclipse and other free IDEs available, users have little incentive to use Microsoft's Visual Studio for anything outside of .NET development. As Microsoft sees the world of development opening up, perhaps it is seeing its development audience migrating toward tools that embrace openness rather than exclusivity.

Is this Microsoft's attempt at becoming more "open"? Will we see students flocking to download Visual Studio or Expression Studio for development? I've used Visual Studio as my primary development tool through most of college and at work from time to time. I certainly see the value in having it, but is this enough to steal users away from Eclipse and other open/free IDEs?

Tuesday, February 19, 2008

Categories and Tags

In the debate over using tags or folders, tags have an obvious advantage: an item can belong to multiple tags. For example, an article can be tagged "Politics" and "Economics" because it has content related to both. A similar concept is categories. Categories are not tags, but they serve a similar purpose. Categories are more rigid than tags, and at the same time they can help users of your site find related information.
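The difference shows up clearly in a toy data model (the articles and tags below are invented for the example):

```python
# An article carries many tags, but typically only one rigid category.
articles = [
    {"title": "Budget vote", "category": "News", "tags": {"politics", "economics"}},
    {"title": "Rate cut", "category": "News", "tags": {"economics"}},
]

def by_tag(tag):
    """Return the titles of all articles carrying the given tag."""
    return [a["title"] for a in articles if tag in a["tags"]]
```

With folders or single categories, "Budget vote" could live in only one place; with tags, it is found under both "politics" and "economics".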


Sunday, February 17, 2008

An example of the social web gone bad

An AP seed over at Newsvine offers some chilling reality when it comes to personal information on the web. The article got me thinking about how easy it can be to discredit someone by rumor. Now that some employers google/search social networking sites for prospective new hires, what prevents someone from gathering a couple of Flickr pictures and creating a fallacious profile?

Semantic Web Should Improve Search

So I got the flu the other day. It was probably passed to me via my elementary school-age kids who also had a mild case. The weird thing is, three of us each acquired an unusual eye twitch, all starting within a day or so of each other. I was curious if this might be a common symptom of the flu, maybe related to body aches or something. So I did some Internet searching e.g. using 'flu "eye twitch"' and did not find anything in the first page of results that associated the two conditions. I did find lots of pages where eye twitch was discussed, and flu was mentioned elsewhere in the page (often in an advertisement). I pondered my search failure, and accepted that popular search engines using term co-occurrence always run the risk of returning irrelevant content. What would really improve search is the clue that flu and eye twitch are both medical conditions, and my query is whether one condition is related to another, e.g. via a symptom-of relationship. On subsequent searching I did find two blog posts (Eyelid Twitching and Eye Twitch) that each discuss associations between flu viruses and eye twitches. While I've still read nothing official, there is apparently some anecdotal evidence out there to support the hypothesis that a flu virus may cause eye twitches. I look forward to the day when search engines will recognize mentions of e.g. medical conditions and domain-specific relations in text, and when the corresponding search interfaces offer users ways to specify semantic queries.
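The kind of semantic query I wanted could be expressed over subject-predicate-object triples. A tiny sketch (the triples below are illustrative, not real medical assertions, and real systems would use RDF and SPARQL rather than Python sets):

```python
# Tiny triple store: conditions linked by an explicit symptom-of relation.
triples = {
    ("eye twitch", "symptom-of", "influenza"),
    ("body aches", "symptom-of", "influenza"),
}

def related(condition_a, condition_b):
    """True if condition_a is recorded as a symptom of condition_b."""
    return (condition_a, "symptom-of", condition_b) in triples
```

A search engine with access to relations like this could answer "is an eye twitch related to the flu?" directly, instead of relying on term co-occurrence.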

Semantic Web Case Studies

I read The Semantic Web In Action, the Scientific American article Prof. Chen recommended to us. I've been particularly interested to learn more about whether and how businesses are benefiting from the semantic web and/or semantic integration efforts. I want to point out that the article cites the W3C Semantic Web Education and Outreach Interest Group, which has compiled a number of case studies and use cases. It will take some time to go through the case studies in detail, but they clearly represent a valuable resource for understanding what about the semantic web is working and what challenges remain.

Tagging your musical life

The site is a web 2.0-themed service which tracks the digital music you've been listening to over the course of the past day, week, and the life of your account. The site offers a desktop application to communicate with the website every time you play a complete track. But where it becomes interesting is in the extensibility of the data you're providing.
On the surface, the site provides a clean interface for other people (your friends, people who read your blog, strangers) to see and sample what music you've been into lately. The site also syndicates what you've been listening to in an RSS feed, which means you're able to plug that data into other sites like Facebook, or into your own websites and web projects. Also, the site has a friends feature, which not only lets you link to people with similar tastes but also compares your "musical compatibility" with another user on the site. The concept seems really intriguing: what if one day some company like eHarmony decided to add an integration that tested compatibility based on similarity of music tastes? I mean, I could probably live with someone loving a band like Fall Out Boy, but what if my library was littered with artists like Lupe Fiasco, Saul Williams, Common, and Kanye West, and they set me up with someone who thought the end-all, be-all of rap was Eminem? Disastrous!
The site is an interesting example of what I'd term a back-end data-driven service, where the front-end site is certainly worthwhile, but the back-end data being collected is where the real value and substance is.
In the interest of driving more attention, you can check out what I've been listening to there. And hey, if you're already on the site, sign in on the comments and let's get a feeling for how musically connected our class alone is!

Advertising on the Web

Advertising on the web has come a long way. What started out as simple banners that would appear at the top or bottom of web pages has evolved into many new forms. The most annoying is the talking animated advert. Reminiscent of a TV commercial-in-a-box, thankfully most of these have a mute button to silence them.

Then there are the "shoot-the-monkey" type ads. Basically you click on a moving monkey or something, and it tells you you've won a prize since you were able to click on it. But it actually ends up forwarding you to some site where they want you to buy something. Susan Kim, a creative director, has an interesting article describing how the appropriate use of "rich media" (essentially anything beyond a simple .gif or .jpeg) can increase clickthrough rates by up to 14,000% or more. But do it wrong, and you'll simply turn away potential clickers.

It's a constant struggle between the advertisers and the consumers viewing the sites. Consumers don't want to deal with ads, and will use things like Firefox's AdBlock plug-in to automatically hide ads on pages. But advertisers want consumers to click their ads, and possibly end up spending money at their advertiser's sites as a result of that click.

Honestly, I find most forms of advertising annoying. But some ads are actually fairly clever and even somewhat useful/relevant. Like when I do a Google search for "National Rental" (trying to find the home page of National Rent-a-car), the first thing that shows up is a "Sponsored Link" to National Car Rental (essentially an ad that National paid Google to automatically place when certain search terms are entered). This is useful, since I was trying to navigate to the National web site anyway, and the ad provides a useful link to the place I was trying to get to.

Advertisers need to find that happy balance between "my head hurts from looking at this advertisement" and "I didn't even notice that, it blended right in." I think they are making progress, but I'm sure it will be an ever-changing field, especially since web technologies advance so quickly.

Web Search Aggregators?

Back in the day, before Google had pretty much conquered the web search market, there were many search engines out there for finding web content. Some, like AltaVista, HotBot and Excite made use of web spiders for locating and indexing new content. Others such as Yahoo maintained a staff of humans that would maintain a directory of sites in various categories and allow users to search it.

And there were others, such as Ask Jeeves (now simply Ask), that made use of advanced natural language processing algorithms to attempt to answer your question intelligently rather than simply pattern matching the full text of the web for your search term. Ask Jeeves was especially nice because it felt like you were able to intelligently ask a question and get a response, such as "Where can I find the exchange rate from dollars to GBP?" None of the other search engines were really able to do that.

One other site that stuck out was MetaCrawler. It didn't actually maintain a web index of its own; instead, it would send your search query out to all the major search engines, rank the results, and then show an aggregated list of potential hits. Before I started using Google, MetaCrawler was my default search engine for pretty much any query (with Ask Jeeves a close second for things that could easily be formed into a question).

It appears that MetaCrawler is still around, still searching, as are most of the other big players. But does anyone actually use them anymore, with Google as the gold standard of web search? Will some new player enter the field with some amazing new technology that could challenge Google's dominance? I'm looking forward to that day, even if it simply means that it will force Google to innovate once again.


Xfire

Xfire is an instant message client with one unique addition over other clients: it can be used inside of a PC-based video game. The client is switched on before a game is played and can detect when a game is being launched. This allows it to display what game you are playing to all of your friends, as well as track your game stats for later viewing online. Making an Xfire account also gives you access to the forums and profile system. It's an interesting application of social web technology: there is no need to visit the website once you have downloaded the client, yet you are still part of the network. Lastly, there is no reason to keep playing a game just to stay in touch with the people you used to play it with; Xfire works perfectly well outside of a PC game and has even been brought to GAIM (now Pidgin) through an extension, although that extension does not seem to be updated on a regular basis.

Amazon's Mechanical Turk

Amazon has a service called Mechanical Turk, or mturk, named after a chess playing 'machine' that turned out to be a human. The service allows you to post jobs that are easy for humans but difficult for computers. The service, sometimes called "crowdsourcing", typically pays each person completing the task a small amount.

An article discussing the mturk service notes that opponents have called it a "virtual sweatshop". Rebecca Smith, a lawyer for the National Employment Law Project, sees mturk as

"...just another scheme by companies to classify workers as independent contractors to avoid paying them minimum wage and overtime, complying with non-discrimination laws, and being forced to carry unemployment insurance and workers compensation."

Probably the best-known example I found was an art project called "The Sheep Market". As part of his master's thesis, Aaron Koblin submitted a request for 10,000 drawings of sheep facing left, at $0.02 apiece.

To test out mturk I drew several pictures for a follow-up art project. I earned $0.04 in Amazon credit.

Semantic Web and Natural Language Processing

I think there is great partnership potential between the semantic web's goals and natural language processing (NLP). NLP wants machines and software to be able to read, understand, and create human language. The semantic web wants machines and software to understand what is written on web pages. So really, the semantic web is what you do after you have conquered NLP: you would then have literate bots "read" the internet, so you could enter questions into search engines rather than keywords.

I'm not the only one with this idea, of course. Powerset is actually betting money on it. They're working on a semantic search engine powered by NLP.

tagging to improve image search

Using user-generated tags for image search has been around for a while. Flickr has been using tags to find images for several years. Google managed to turn tagging images into a game: pairs of people are given a set of images and asked to think of labels for them. Points are awarded for labels matching your partner's. The labels gathered from the game are used to improve Google Image Search.
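The scoring rule of such a labeling game is simple to sketch: points only for labels both partners independently produce, and the matches become trusted tags. This is an illustration, not Google's actual implementation:

```python
def score_round(labels_a, labels_b):
    """Award one point per label both players agree on.

    The agreed labels are the ones worth keeping as image tags,
    since two independent humans converged on them.
    """
    matches = set(labels_a) & set(labels_b)
    return len(matches), matches
```

Requiring independent agreement is what filters out idiosyncratic or spammy labels without any moderator.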

Google Image Search isn't the only situation in which user generated tags can improve image search. In a project described here, several prototypes were developed and tested for museums to gather tags to improve access to their online art collections.

In this example, users were asked to provide tags to describe the statue pictured.

Some museums have already implemented similar systems. The Cleveland Museum of Art's online collection makes it easy to tag images by placing a button labeled "Help others find me" next to the image of the art piece (see example). Unfortunately, there doesn't seem to be a way to view the tags that other users have submitted.

While these concepts aren't new, I think that we're going to start seeing more and more websites using user-generated content to improve search.

Yahoo stops innovating? has reported that Yahoo's Design Innovation Team, known as yHaus, has been let go and shut down. Over a thousand employees have been laid off due to the shuffle over at Yahoo, but are layoffs the answer? Innovation is what started Web2.0 and innovation is what is driving it. I guess yahoo will now go the Microsoft route and just buy innovation that they deem to be potential money makers...


AbsentEye

AbsentEye is a new website that aims to improve the lives of students who have trouble asking other students for help. Students register with the website and can then begin posting and downloading notes for free. For quality control, users are able to rank notes posted by others. Does this have potential as it is? Or is it missing something?

Saturday, February 16, 2008

Web 2.0 is driving the IT industry

This news is evidence that Web 2.0 has the ability to drive the industry. Sun has decided to help companies with virtualization tools: "Sun Microsystems wants to equip the next generation of Internet companies with its hardware and software and will offer virtualization products to help them keep costs down, make their data centers more flexible, and give developers multiple development environments".

As people keep using different platforms, they will want to use more and more Java code. This seems like a fairly clever move to me. Also, now that they have MySQL, all the PHP-MySQL fans have to go to Sun too.

CEO Jonathan Schwartz's statement, "but I prefer to focus on acquiring new customers, not on the competition", confirms that they also want more users to use their products and eventually become customers some day.

Amazon Web Services went down

Amazon Web Services, which provides web-scale computing infrastructure to many other companies, was down for over 30 minutes yesterday morning. A number of websites that depend on Amazon's S3 storage and EC2 compute cloud were affected. Eric Schonfeld (TechCrunch) comments on the reliability of the cloud computing business:
"Nobody is going to trust their business to cloud computing unless it is more reliable than the data-center computing that is the current norm. So many Websites now rely on Amazon’s S3 storage service and, increasingly, on its EC2 compute cloud as well, that an outage takes down a lot of sites, or at least takes down some of their functionality. Cloud computing needs to be 99.999 percent reliable if Amazon and others want it to become more widely adopted."
This raises an issue with relying on storage services like S3: one should never build an architecture that requires high-availability access to such a service. Further, it's a warning signal to storage service providers to build architectures with no single point of failure.
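Avoiding a hard dependency on one storage service can be as simple as a fallback chain. A sketch with dict-backed stand-ins for storage clients; real code would call the S3 API and a secondary store:

```python
def fetch_with_fallback(key, stores):
    """Try each storage backend in turn so one outage doesn't take the site down."""
    last_error = None
    for store in stores:
        try:
            return store[key]          # stand-in for a real GET call
        except (KeyError, OSError) as err:
            last_error = err           # backend down or object missing; try next
    raise RuntimeError("all backends failed for " + repr(key)) from last_error
```

The point is architectural: the site keeps its functionality during an S3 outage, even if the fallback is slower or staler.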

Friday, February 15, 2008

Yuwie

Yuwie is a social website that pays its users. It has all the features we've come to expect from a social website, with the addition that users are paid a certain amount each time their profile is viewed. Payment can also be earned through referrals, with each referral's earnings adding to the referrer's earnings. Do you think this is a good model for revenue sharing among websites that depend on user-generated content?

The Semantic Web - Not Quite Machines Only

The semantic web was designed to allow machines to function on the web independently of human beings. Semantic web ontologies provide definitions for terms that machines can understand, but if a machine needs a definition for something it has encountered, how does it find a document with that term?

UMBC's Swoogle is a semantic web search engine designed to retrieve semantic web documents for queried terms. Yet even if a robot could go to Swoogle and retrieve a set of semantic web documents, how does it know which one has the best definition for the term it needs?

I've been working on a way to solve this problem using standard AI state-space search algorithms to find the best definition for a term based upon the terms it occurs with. Since you need definitions for all terms, you should try to find ontologies which define as many of your terms as possible; that way you can cover the terms with the fewest semantic web documents.
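The covering idea can be sketched as a greedy set-cover approximation. This is a standard heuristic for the covering subproblem, not necessarily the exact search algorithm described above, and the ontology names are invented:

```python
def cover_terms(terms, ontologies):
    """Greedily pick ontologies until every term is defined.

    ontologies maps an ontology name to the set of terms it defines.
    Returns the chosen ontologies and any terms left uncovered.
    """
    uncovered, chosen = set(terms), []
    while uncovered:
        # Pick the ontology defining the most still-uncovered terms.
        best = max(ontologies, key=lambda o: len(ontologies[o] & uncovered))
        if not ontologies[best] & uncovered:
            break  # no remaining ontology helps
        chosen.append(best)
        uncovered -= ontologies[best]
    return chosen, uncovered
```

Greedy set cover is not optimal in general, but it gives a small set of documents quickly, which is what a robot resolving unknown terms on the fly would need.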

Automatically Tagging Information

Reuters has a service called OpenCalais that takes user submitted content and adds semantic markup to it. The site offers an API that web services can use to access the service. The Overview page  is an interesting read. I think this is a good step towards making more semantically-tagged information available.

Thursday, February 14, 2008


Folksonomy, i.e. user-generated tagging, has some flaws that could be avoided by following certain simple techniques, as pointed out by this site. Misspelling is one of the reasons listed, but then I think: why can't the system alert the user adding a tag that he has misspelled the word? The same is the case with using plural forms (apples instead of apple).
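A tagging system could catch both problems at entry time. A sketch using naive plural stripping plus fuzzy matching against the existing tag vocabulary (the 0.8 cutoff is an arbitrary choice for the example):

```python
import difflib

def suggest_tag(new_tag, existing_tags):
    """Suggest an existing tag when the new one looks like a plural or misspelling."""
    tag = new_tag.lower()
    if tag.endswith("s") and tag[:-1] in existing_tags:
        return tag[:-1]                     # "apples" -> "apple"
    close = difflib.get_close_matches(tag, existing_tags, n=1, cutoff=0.8)
    return close[0] if close else None      # None means the tag looks genuinely new
```

The system would show the suggestion ("did you mean apple?") rather than silently rewriting, since sometimes the user really does mean a new tag.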

We need to Standardize License Agreements

After reading this post, I thought that we should standardize privacy policies and license agreements. Let me explain what exactly I mean by that.

In class Dr. Chen expressed the need for having information in a machine-readable format. For this purpose we mainly use standards like FOAF, OWL, etc. Companies are supporting the development of such standards because it's in their interest to have all data on the Web in machine-readable format. They can make money out of it.

Similarly, why shouldn't we have a standard for license agreements and privacy policies? For example, a site like Orkut or MySpace could publish usage and license information in some standard format:


<company>Google</company>
<application>Orkut</application>
<profilemode>public</profilemode>
<informationlifetime>forever :)</informationlifetime>
<othertags>value</othertags>

This way, I would be able to decide whether I want to join a particular site or not. I would use some automated "Software Lawyer" that tells me the risk associated with joining a particular site.
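Such a "Software Lawyer" could be a trivial parser over a standard format. A sketch using the made-up tag names from the example above (this is not an existing standard):

```python
import xml.etree.ElementTree as ET

# Hypothetical machine-readable policy, following the made-up tags above.
POLICY = """<policy>
  <company>Google</company>
  <application>Orkut</application>
  <profilemode>public</profilemode>
  <informationlifetime>forever</informationlifetime>
</policy>"""

def assess_risk(xml_text):
    """Flag risky terms found in a machine-readable license policy."""
    root = ET.fromstring(xml_text)
    risks = []
    if root.findtext("profilemode") == "public":
        risks.append("profile is public by default")
    if root.findtext("informationlifetime") == "forever":
        risks.append("data is retained forever")
    return risks
```

A few lines of code replace reading ten pages of legalese, which is exactly the point of publishing policies in a standard machine-readable form.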

But I am sure these companies won't have such a thing, because they all want to use our information in abnormal ways, and they want to hide that fact behind those 10-page license agreements.

What do you guys think? Or does there exist something similar to what I just proposed?

Wednesday, February 13, 2008

Social Wallpapering?

I can understand the desire to start a social web site that offers a neat service or tool, and make money while doing so. It is something that interests me.

I would consider "social web" to be more of a "Web 2.0" label. Forums, communities, and sharing sites of "Web 1.0" are by no means outdated now. When I think of a forum or sharing site from "Web 1.0" I think more of the term "community," and when I think of Facebook, MySpace or Flickr I think "Web 2.0" and "Social Web."

So when I discovered "Social Wallpapering" I got confused. It seems that it will be a wallpaper site like stock.xchng and Caedes. Despite the logo, Ajax buttons, and underdeveloped state, I refuse to call this social web. It's also hard to take this site seriously, since currently the wallpapers are going up with no guarantee that the submitter has the rights to that artwork, mainly because this "community" site doesn't even have user registration set up yet.

Tuesday, February 12, 2008

Facebook = Blackhole?

Is Facebook A Black Hole For Personal Info? Apparently if you delete your profile, your profile not only still exists but is accessible to outside users. A dude tried to delete his Facebook profile, found out the information was still available on Facebook servers and started an email campaign to have Facebook remove his information. Facebook emailed him back saying that they deleted his account. Well almost... a reporter later found his empty profile and was still able to contact him through the network.

Industry and Academia Care about Social Networks

Social Web Technologies form a great point of intersection between industry and academia. The International Conference on Weblogs and Social Media is a young conference with strong support from industry leaders such as Microsoft and Google (Along with a host of smaller and more specialized companies).

Academia has spent a great deal of effort studying the social web. In fact, by comparing the result counts from Google Scholar to those of more traditional research areas of computer science, we can see that social networks are a strong field of study. Listed in order are the queries and their result counts: first comes algorithms with 4.2 million, then social network with 2.5 million, followed by machine learning with 1.9 million, natural language processing with 1.8 million, artificial intelligence with 1.3 million, and finally the semantic web with 0.3 million.

Monday, February 11, 2008

Computer Books Riding the Web Revolution

So this post may not be of the strictest relevance to social web technologies, but I think it's greatly relevant to the project we're all going to be working on. is a paid site which offers a great deal of programming resources, web-accessible through an online book reader, at a monthly rate. You pay a different rate depending on how many books you want to be able to read each month, and you're also able to grab print books at a 35% discount if you want to own a permanent copy.
I also contacted the company about a reduced student rate, and they let me know that if you open an account and email their tech support, you can get 5 books out at a time for $10 a month. I know there are probably a couple of resources I want to pick up in order to get up to speed on Java again, not to mention working with some of the frameworks, but it's not the kind of thing I'm going to want to plunk down hundreds for just for this semester. I'm not affiliated with the company in any way; I just know other people in the class are probably poor college students like me, and I was able to get all the comp sci books I'm going to need for this semester through this site for less than the cost of one book. Hopefully this is useful and not seen as spamming up the blog, but if it is, feel free to revile me in the comments.

Open Source + Java + Web = Good

A simple formula, but no less potent: freely available, platform-independent open source Java tools like the Rome Parser or the Apache Feed Parser have made it easy for developers to integrate Web 2.0 into already existing technologies or to develop new technologies around them.

The open source community (and the Java open source community in particular) has a very good track record of supporting the web as a whole, from its HTML Parser to its JDOM Parser, to the Apache Tomcat web server, all the way back to the Apache Httpd web server, which is still the most popular web server in the world. Truly, Java and open source have done wonders both for the web and with the web.

As Paul's choice and history shows, open source on the web is a smart way to develop.

Mobile Social Networks

A recent post on the New Scientist Technology blog highlighted the research of Vassilis Kostakos, who has developed a Facebook application that uses mobile phones and Bluetooth to "introduce" you to people you get close to in person. CityWare uses a local Bluetooth network that detects phones running the application that are close to each other. The New Scientist post also refers to another application Kostakos is writing that will update you with news about friends who are physically close to you. This work raises the usual privacy concerns outlined in OpenSocial: Pros and Cons, and I think illustrates the innovator case of boyd's post.

Sunday, February 10, 2008

Open Source + Social Web Applications

So, a few weeks ago at work I was tasked with setting up a weblog for our project to use, in which any of the members of our team could post information and assign different categories to the post. The goal of the blog was to replace paper logbooks, allowing easy searching and archival of data (instead of the old "fill up a cabinet" method).

In searching for the best software to use, I came across two products, Movable Type and WordPress. I was initially drawn to Movable Type, but after playing around with it a bit and reading up on the licensing, I chose WordPress. Currently to use MT in an Enterprise environment, you need to purchase a license for the premium version. Now, they have a completely Open Source version that they released somewhat recently, but it seems to be stuck in a perpetual beta state, and I wouldn't feel comfortable using it in our environment because of that.

WordPress is pretty nice though, I was able to quickly and easily integrate it into our LDAP authentication database to enable single-sign-on for all our users, and there are thousands of plug-ins out there that enable all sorts of functionality.

In this case, the more open-source solution won out over the proprietary solution since we don't really have a budget allocated to purchase a commercial software license for blogging software. At the same time though, Movable Type had a few features that WordPress doesn't seem to (at least without finding a plug-in to perform the task).

So, which one is the right business model? Do you release something for free for personal use but then restrict your license so that if you are a commercial entity, you need to purchase a license? (a-la Movable Type, the QT graphical toolkit) Or do you just license your code under the GPL and then sell support contracts to those who want or need them (a-la the MySQL DBMS), hoping that people will pay? I'm not sure... My gut feeling is leaning towards the second, simply because in the end more people will be using your product, meaning hopefully more people will end up paying for support. But at the same time, if people don't have to pay, will they?

OpenSocial: Pros and Cons

Recently, there's been a lot of discussion about Google's OpenSocial API. Blogger and researcher danah boyd posted an interesting article on her blog describing some reasons why she believes making social network data more easily obtainable is not a good thing.
...There's a lot to be said for being "below the radar" when you're a marginalized person wanting to make change. Activists in repressive regimes always network below the radar before trying to go public en masse. I'm not looking forward to a world where their networking activities are exposed before they reach critical mass. Social technologies are super good for activists, but not if activists are going to constantly be exposed and have to figure out how to route around the innovators as well as the governments they are seeking to challenge...

Tim O'Reilly's blog describes an opposing view. In O'Reilly's opinion, opening up social graphs will eliminate the false sense of security many social network users have. Security-through-obscurity is never a good thing, and by showing people how their publicly posted data can be used, it's hoped that they learn to better protect their information.

Another blogger, on a privacy blog, makes the point that
Relationship information is not the property of individuals - it is held in joint custody among all parties in a relationship...

If someone that I've 'friended' wants to somehow use a social network (that I'm a part of) do they need to get my permission?

Open Social Web's Bill of Rights is a good start, but obviously this has no real legal weight and depends on companies voluntarily following it. These issues should be dealt with now, while the technologies are still being developed.


Web applications can be developed and changed very quickly. This is one of the main reasons Web 2.0 is constantly in beta: changes and tweaks can be made with great ease.

It seems some people are taking rapid development quite seriously. ReadWriteWeb has an interesting article about weekend code-a-thons. Rapid development retreats where teams work either with each other or against each other to develop web applications. The teams work to go from design to launch in just two days.

If I had a weekend to spare I would love to try this out. I think it would be an excellent weekend to learn and meet with other like-minded people as well. I imagine it would be like trying to do our group projects in a weekend.

New video API

Yahoo has launched a video API, "Yahoo Live". The Yahoo Live service can be accessed through JavaScript and REST, and is an addition to the already existing API list. Using these APIs you can query people broadcasting videos, find information on a broadcast, and find information on the people watching it. A unique feature of Yahoo! Live is that other people with webcams can be part of the main broadcast that they are watching, as co-viewers. Yahoo now offers a list of 27 APIs. With everybody rushing ahead to release different APIs, I doubt they test them thoroughly before releasing.

Saturday, February 09, 2008

RSS and Data Signaling

One of the newer trends coming out of the RSS community is integrating the technology on the application end for the purpose of signaling the application of new data. Two areas where this is currently being employed are appcasting and torrentcasting. In appcasting, you're able to notify the user of application updates using a simple RSS feed - and since there's boilerplate code for the application-end code, this makes it really easy for a developer to incorporate. Torrentcasting is often employed for series-related content distributed over the BitTorrent network, and allows for notifying and downloading every time a new "element" of the series is added (a new episode of a TV show comes out, and your BitTorrent client automatically picks it up). The two technologies, which are centered around the RSS 2.0 spec, add an interesting extension in the field of Web 2.0 enabled web applications - a BitTorrent tracker can more acutely feed its way into the desktop experience of a user with torrentcasting.
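The appcasting idea above can be sketched in a few lines: the client fetches an RSS 2.0 feed of releases and reports any items newer than the version it already has. The feed content and version names below are made up for illustration; a real appcast would also carry enclosures with download metadata.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 appcast, newest release first.
FEED = """\
<rss version="2.0"><channel>
  <title>MyApp releases</title>
  <item><title>MyApp 1.2</title><link>http://example.com/myapp-1.2.zip</link></item>
  <item><title>MyApp 1.1</title><link>http://example.com/myapp-1.1.zip</link></item>
</channel></rss>"""

def new_releases(feed_xml, installed):
    """Yield (title, link) for feed items newer than the installed version."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        title = item.findtext("title")
        if title == installed:
            break  # items are newest-first; stop once we reach what we have
        yield title, item.findtext("link")
```

A client on 1.1 would be offered the 1.2 download; a client already on 1.2 would see nothing new. Torrentcasting works the same way, with the enclosure pointing at a .torrent file.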

Launch of Web 2.0 Security Forum

As everyone would have expected, a group of companies has finally started a security forum. These include Credit Suisse, Reuters and Standard Chartered.

Here is the article.

Security is a very important component of Web 2.0, considering the amount of data that is shared and the number of users participating in creating that data. It is a promising step by these companies that will only help the growth of Web 2.0 users and applications.

Though I am not sure if it is going to be significantly different than OWASP.

Data Storage

I was up late last night talking to my friend about various computer issues, and he raised an interesting thought: we will soon be creating data at a rate faster than our storage capacities are growing.

"The lack of sufficient data storage capacity has legal compliance implications for companies facing data security, privacy protection, record keeping, data retention and other data usage requirements of e-discovery rules for litigation, the Sarbanes-Oxley Act, Health Insurance Portability Act and other laws, the report said." [Privacy Law Watch]

Social web companies have to think about both aspects: offering space to their users for their content and any data laws that may apply to the company itself. Data storage is a critical part of social web technology and I never gave it much thought before last night.

Issues with Social Graph API usage

One of the common spamming techniques on social networking sites is using specialized spamming software such as FriendBot/BuddyBot - automated friend adders, or tools that post comments/notes to multiple users. Such tools use the sites' search features to reach a certain section of the users and communicate with them from a fake account. Now, with social graphs, it would be easier for such bot tools to retrieve numbers of such related users.

Further, the Social Graph API can be used as a tool by social-engineering hackers to earn undeserved trust by creating and exposing networks of weak social connections. This can be exploited further to carry out phishing attacks.

The Facebook API

I've always been interested in the Facebook API but I never really got a chance to actually look up how to write a Facebook application. So, after a bit of googling I found a few decent tutorials and learned a few interesting things.

Here is a PHP Facebook tutorial, a Python Facebook tutorial and even a Java Facebook tutorial. I also found some helpful information on demystifying the application form.

To briefly summarize, it seems that you can write your Facebook application in whatever language you want. Your application still needs to use the Facebook API as well as some FBML tags, but developers are not limited to just FBML. This is because of the architecture Facebook uses to run the applications: a developer hosts their code on an external server, and whenever a user makes a request of that application, Facebook makes a request of the code running on the external server. Facebook then can display the application.
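That request flow can be mimicked in a few lines. This is not the real Facebook API - the parameter names and the tag substitution below are simplifications invented to illustrate the proxy architecture: Facebook forwards the request to your server, your server returns FBML, and Facebook resolves the FBML tags before rendering the page.

```python
# A toy stand-in for the developer's external server: given the request
# parameters Facebook would forward, return an FBML fragment for Facebook
# to render. Parameter names here are illustrative, not Facebook's real ones.
def canvas_callback(params):
    user_id = params.get("user", "stranger")
    return "<fb:name uid='%s' useyou='false'/> says hello!" % user_id

def facebook_renders(app_callback, params):
    """Facebook's side of the handshake, crudely simulated: fetch the
    app's FBML, then substitute FBML tags before serving the page."""
    fbml = app_callback(params)
    # Real Facebook would resolve <fb:name> to the user's actual name;
    # here we hard-code one user for demonstration.
    return fbml.replace("<fb:name uid='42' useyou='false'/>", "Alice")
```

The point of the indirection is that your server never talks to the browser directly, which is why the app logic can live in PHP, Python, Java, or anything else that can answer an HTTP request.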

Some pretty interesting stuff and certainly some good starting points for those of you interested in creating Facebook applications.

Personalized social search

In an interview with VentureBeat, Google's leading VP in search, Marissa Mayer, defined social search as "...any search aided by a social interaction or a social connection... Social search happens every day. When you ask a friend 'what movies are good to go see?' or 'where should we go to dinner?', you are doing a verbal social search. You're trying to leverage that social connection to try and get a piece of information that would be better than what you'd come up with on your own."

She mentions that social search has not shown its potential yet. Google tried to implement social search by giving users the ability to annotate search results and allowing those annotations to be shared with people of similar interests. They tried it in Google Co-op, but the model didn't work very well. Google is also carrying out an experiment to let users vote on search results.

The critical thing would be to make use of search results on similar topics from other users who have the same interests. To find such users, it would make sense for Google to utilize users' connections from their friends lists on Facebook or MySpace, where one can get relevant social context. Further development of social graphs would be a noteworthy step. So, in the future, it won't be surprising to see Google's PageRank influenced by connections on social networking sites, providing more personalized search results.

Blogs: David vs Goliath

The social web has empowered the average citizen in ways that no other medium has. For instance, a Verizon FiOS customer had been seeing the personal information of another customer whenever he logged into his account. Verizon acknowledged the problem but failed to fix it after 8 months. He posted the story on his blog, where it was picked up by Consumerist and Digg. Not long after, he received phone calls and emails from Verizon and his account issues were taken care of. These types of occurrences are less heard of in the world of TV, radio and the like. Why do you think that is? Do you think the social web might one day become more restricted? (Net neutrality) Or is it too late to put the genie back in the bottle?

Friday, February 08, 2008

Data on social networking sites

This site here gives some data on the market share of different social networking sites. I use Orkut, and it's surprising that it is listed somewhere near the bottom of the list. :)
Here's another data set. The interesting column to watch is "Average visits per person per month", marked in yellow.

FOAF Consolidation and Editing

Following the FOAF gear link from Dr. Chen, I found this post that discusses how to consolidate other profiles (flickr, twitter) into a FOAF profile using owl:sameAs links. I like this in that it doesn't require you to import all of your other profiles into your FOAF profile but it allows you to make your FOAF profile the 'master profile' for everything. Unless, of course, you want to keep various networks you have separate. I'm still searching for access controls on FOAF profile data, but I do think that would be a viable way to allow a single profile to be maintained and only give access to certain parts of it to sites.
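The owl:sameAs consolidation described above looks roughly like the following FOAF fragment (in Turtle syntax). The names and URIs are invented for illustration; only the foaf: and owl: vocabularies are real.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

<#me> a foaf:Person ;
    foaf:name "Jane Doe" ;
    # Assert that these service-specific identities denote the same person,
    # making this FOAF file the 'master profile' without importing their data.
    owl:sameAs <http://flickr.example/people/janedoe#me> ,
               <http://twitter.example/janedoe#me> .
```

A crawler that trusts the sameAs links can then merge whatever each service publishes about those URIs into a single view of the person.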

That post also led me to knowee, which is the start of using linked RDF profiles to maintain a single social graph. The screencast is particularly interesting to watch.

Thursday, February 07, 2008

Directory of Web 2.0 sites

This site presents a list of Web 2.0 sites in a Web 2.0 way. It is interesting to notice that the attractiveness of a site in all aspects is being taken seriously. Rich user experience is one important characteristic of new-generation web sites. The colourful logos these sites use also add to the attractiveness.

Either the search facility is not working properly or the sites that I think are Web 2.0 sites are in fact not Web 2.0 sites.
E.g. when I searched live I was expecting to be listed.
Search the tag Microsoft, and you get Facebook in the results.

Maybe someone used wrong tags intentionally.

$1,100,000 Social Networking domain

I stumbled upon this story about a UK-based travel agency's absurd purchase of a domain name, with the intent to create a social network around cruisers and the vacations they go on. IMO a vacation-based social network would be great, but I think the agency has some work to do in the areas of design and marketing if they wish to see a return on their investment.

A Not So Small World

In class on Wednesday Dr. Chen briefly mentioned the well-known idea of six degrees of separation between any two people. However, in a presentation by Haym Hirsh on the future of AI in relation to the NSF, Dr. Hirsh mentioned that this well-known idea is actually false, and that the poor are largely unconnected to the rest of us.

This article points out many of the flaws with experiments designed to test the small-world hypothesis, including unrepresentative samples and low success rates within experiments. It's an interesting read on an idea I had largely accepted and adapted to without proof.

We may not be 6 degrees apart

In class we talked about friend-of-a-friend networks and how everyone is supposedly only six degrees of separation away from everyone else. It turns out that Milgram's famous small world experiment didn't actually have the incredible results that we commonly associate with it.

The experiment gave envelopes to people in Kansas with the name of a target person and several details about that person's life. Participants were asked to pass the envelope on to someone they knew who could get it closer to the target. Some of the stories sound pretty amazing:

"...envelope that made its way from a wheat farmer in Kansas to the target, a divinity student’s wife in Cambridge, Massachusetts, with just two connections."

This recent article from Discover Magazine discusses some of the limitations of Milgram's experiment. The studies had a completion rate of only 5-30%. Judith Kleinfeld's paper, "Could it be a big world after all?", describes how after extensive research Kleinfeld found only two replications of Milgram's work, which is pretty low for something so universally accepted as truth. Another drawback of the chain-mail approach of the original is that, because participants can't see the entire network, they may inadvertently send the letter further away from the target.

With the popularity of social networking sites comes the opportunity to further research the idea of six degrees of separation. Many of the drawbacks of Milgram's experiment could be eliminated. Researchers looking at data from social networking sites can see the entire social network and find the shortest path. Completion rates aren't an issue if computers are tracing the connections. One example, LiveJournal Connect, tries to find a path between two LiveJournal users. You could probably get some pretty interesting results with something similar on Facebook.
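Finding the shortest path a computer would trace is a textbook breadth-first search over the friendship graph. The toy graph below is invented; on a real site the adjacency lists would come from the friends tables.

```python
from collections import deque

# A toy friendship graph as adjacency lists; the names are made up.
FRIENDS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann", "dan"],
    "dan": ["bob", "cat", "eve"],
    "eve": ["dan"],
}

def degrees_of_separation(graph, start, target):
    """Breadth-first search: length of the shortest friend chain, or -1 if none."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        person, dist = queue.popleft()
        if person == target:
            return dist
        for friend in graph.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return -1
```

Because BFS explores the graph level by level, the first time it reaches the target is guaranteed to be via a shortest chain - no envelopes wandering off in the wrong direction.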

Security? We'll get to that later.

There have been a couple of posts about security here, and we have seen an example of a website getting hacked in - what was it? minutes? - in yesterday's lecture.

So why do we see security and privacy problems popping up in the Web 2.0 world?
One possible explanation is that if you want to share things with others (and Web 2.0 is all about sharing), then you must sacrifice a certain degree of privacy. Also, if your code is open-source, then hackers will find exploits by merely looking at your code.

But there's another explanation: this article suggests that developers are just too eager to implement new features, and security gets neglected. This phenomenon is not new: the same thing was going on when new and exciting desktop apps were coming up, the philosophy being that the priority is shiny new features for the user, and a stable and secure back-end is merely an afterthought: something that would be nice to have, but no one will notice if it's not there.

FOAF network visualization

FOAF is an ontology for describing our online profiles and relationships with other people. It's not too interesting for general users to view FOAF documents in their raw RDF form. We need visualization tools.

A couple of interesting prototypes:
You can read more about the FOAF gear here.

Why People Join Social Networks

Zooming around Google I came across Why People Join Online Social Networks. It offers a slightly more detailed list of reasons why someone might join a social network than our in-class slides [slide 5] did. It seems nobody bothered to expand his list after it was posted, but, off the top of my head, I think I can add one: to play a game. Aside from the variety of massively multiplayer online games, there are plenty of websites like that allow users to come together and play simple games.

I Divorce U

Here's a Washington Post article on the recent issue of text messages as a medium for issuing a declaration of divorce in Egypt:

What's astonishing about the article is that, apparently, the issue has already been tackled in other countries under Islamic law - in Qatar, the UAE, and Malaysia. I hardly know what to say on the issue other than "Wow". On one end, it speaks volumes about how ingrained in everyday life technology, and especially cell phones, have become. I can text message 12 characters just to get a myriad of services: I can order a pizza, break up with my girlfriend, get movie times and weather information; if I'm in the mood I can even "Text 35523 to have a good time!!! for a nominal fee".

On the other end, it's just gotten to this point where we spent all this time trying to make our lives easier through integrated networks. And now that we're here, there are forces out there that just want to push our entire lives into these integrated networks. I'm all for convenience, but come on, part of life is interacting with other people verbally. Maybe it's cool that I don't have to talk to some deadbeat answering the phones at Papa John's, but you know, maybe it'd be a good idea to divorce your wife in person.

Wednesday, February 06, 2008

Simple news to map mashup

Here is a fairly straightforward news-feed-to-map-icon mashup that helps viewers learn about "global terrorism and other suspicious events". This appears to be a one-man shop, and the news items are manually edited - perhaps that is why they all appear interesting, e.g. 4 missiles found in Italy, and undersea Internet cables repeatedly cut in the Middle East. I thought the web site was pretty well done, so worth mentioning. Here is an interview with the site creator and news editor.

Tagging leading to semantic web

I'm using gnizr at work for researching and collecting documents of interest to a client of ours. Gnizr provides features I find useful for organizing a pile of documents. Tagging documents with terms of interest is much more flexible than strictly partitioning documents into specific buckets, and the clustermap feature makes it possible to explore the soft-association of documents and terms of interest. I also find the geonames machine tag very convenient for recording when documents are associated with places.

What gnizr is not letting me do (yet) is to record more explicit information- who authored a document, when it was published, what institutions a person or a document is affiliated with, and so on. I think this is, in large part, the promise of employing semantic tags. When we tag a bookmark (in my case, a document), we are asserting that (at least to ourselves) the tag word is related to the document. That's about it. When we tag a document with a geoname, we assert something a bit more specific- that the location is related to the document. With this information alone, we can create a semantic graph whose nodes are the tag words, documents and locations, and whose edges are the relations we implicitly create when tagging. We can take this one step farther and imagine that one tag may be related to another if they are both used to tag the same document. The tags are at least related by that common document. The aggregate tag relations semantic graph can be used as the basis of a tag recommendation system (one of our suggested projects I'm interested in).
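The co-occurrence idea in the paragraph above - two tags are related when they are applied to the same document, and the aggregate relations can drive tag recommendation - can be sketched directly. The data and function names below are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(tagged_docs):
    """Count how often each pair of tags is applied to the same document."""
    pairs = Counter()
    for tags in tagged_docs:
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs

def recommend(tag, tagged_docs, n=3):
    """Suggest the tags that most often co-occur with `tag`."""
    pairs = cooccurrence(tagged_docs)
    scores = Counter()
    for (a, b), count in pairs.items():
        if a == tag:
            scores[b] += count
        elif b == tag:
            scores[a] += count
    return [t for t, _ in scores.most_common(n)]
```

Each (tag, tag) pair is an implicit edge in the semantic graph, weighted by how many documents connect the two; recommending is just reading off a node's heaviest neighbors.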

I think that by using additional machine tags based on FOAF relations or other standard metadata (e.g. Dublin Core) users could encode more explicit knowledge, (Bob is the author of document x), and thus richer semantic graphs. I'm suggesting that a semantic social bookmarking / tagging system could provide an easy and effective user interface for generating semantic graphs. I think the idea requires further elaboration and refinement, but I'm optimistic that a tighter integration of gnizr and additional semantic graph interaction tools could provide a nice path toward a really useful semantic web platform.

Typically, by the time I think of an idea it's become passé - this case is probably no different. So I Googled for "social bookmarking semantic graph" and found a noteworthy blog post that explores extracting semantic relations from tags using tag co-occurrence and frequency. There you go... at least one of the key ideas has already been floated! I'd better get coding!

A Social Network For Everyone!

In reading "Social Network Sites: Definition, History, and Scholarship" I ran across a single mention of Ning, which is a platform for creating social networks. There are already 8,009 social networks on Ning, and the Popular Networks page shows networks ranging from horse aficionados to disc golfers to school affiliation networks. A cool feature of that page is a tag cloud linking to tagged networks. Although clicking on 'food' makes me wonder why 1Club.FM Radio Portal is tagged food...

Anyway, I suppose it was a natural evolution: with the popularity of social networks, someone would build a platform for creating new ones.

The Machine is Us/ing Us

I have always been online; I grew up in the online generation, with an internet connection in 7th grade. I saw the internet change and develop, but I never saw Web 2.0 as any sort of revolution - I simply saw it as slow changes here and there. When I stumbled upon this YouTube video a few months ago, it was a mind-bending experience for me. The internet did change, and I guess I was one of the last people to realize it. It's only about four minutes long but is a good introduction to the change in technology behind the internet.

This video was made by Professor Wesch, a cultural anthropologist at Kansas State University, and his point of view is fairly unique compared to many other experts because of his background.

The video can be found at:
Professor Wesch's page is:

Who (doesn't) need Newspapers?

I was going to write this as a comment on Justin's post, but as my ideas came together and the words started flowing, I figured I would generate a separate post. I wonder what Dennis Mahoney would say about this ;-)

Who needs (hard-copy) Newspapers?

I don't.

When I've got a computer handy, I can navigate to, whose iGoogle home page/news aggregator displays all my favorite RSS feeds.

On-the-go, I can whip out my Palm Treo 700p and navigate to feedm8, a handy mobile feed aggregator that gives me access to the same content, simply in a more accessible format. As long as I have cell coverage, I am good. But this requires paying for an unlimited data plan on my phone, which can be expensive.

But what about people who aren't Internet-savvy? How about those who don't have fancy smartphones that can display newsfeeds, or can't afford a cell phone plan that supports data? Or what if a person wants to simply sit outside in bright sunlight and read w/out squinting (many computer and smartphone screens are somewhat unreadable in bright sunlight).

For those people, a paper newspaper is still probably the only viable option for getting news on-the-go, or even at home. Many people in this country and around the world still don't even own a computer, or don't have reliable Internet access.

I would argue that paper newspapers will not be obsolete until there exists a device that is completely portable, has a large screen that is easily readable (and functions well in direct sunlight), has a very good battery life, has ubiquitous wireless access to Internet newsfeeds from anywhere, and costs (along with its wireless service) close to nothing. Also, most of the population would need to have one of these devices, and know how to use it.

I believe that Amazon's Kindle product ( is a great evolutionary step. Its screen is beautiful (and works great in the sun), its battery life is impressive, and its ability to access the Internet at broadband speeds from anywhere in the US (that can pick up a Sprint cell phone tower) without requiring a wireless data subscription is key. But, due to the nature of the device, Amazon had to lock things down so that it can only browse to a limited set of web sites, including (to download eBooks and purchase things) and Wikipedia. Also, it costs $399 - many people can't afford that. And it only works in the US (so far). But it does offer instant access to many newspapers and blogs. From the Kindle home page:
  • Top U.S. newspapers including The New York Times, Wall Street Journal, and Washington Post; top magazines including TIME, Atlantic Monthly, and Forbes—all auto-delivered wirelessly.
  • Top international newspapers from France, Germany, and Ireland: Le Monde, Frankfurter Allgemeine, and The Irish Times—all auto-delivered wirelessly.
  • More than 250 top blogs from the worlds of business, technology, sports, entertainment, and politics, including BoingBoing, Slashdot, TechCrunch, ESPN's Bill Simmons, The Onion, Michelle Malkin, and The Huffington Post—all updated wirelessly throughout the day.
So, it seems we are approaching the age of the obsolescence of newspapers. But we aren't quite there yet. For those with enough disposable income and enough tech-savvy to afford or use a smartphone or Kindle, newspapers may be obsolete. But for everyone else, paper is cheap and still works as well as it always did for conveying information.

Resurrected: The Industry Standard 2.0

The Industry Standard, originally founded in 1998, was the go-to publication for all things Internet. Startup .coms of the web 1.0 era flocked to advertise in this widely read publication, which attracted readers from all over the tech industry. In 2001, however, The Standard collapsed when the .com bubble burst and investors steered away from anything built on or around .com.

Today, The Standard is re-branding itself under the social networking banner as an online-only publication consisting of a concoction of imported feed stories from around the net, well-known industry writers, freelance journalism and analysis, and community contributions. The social networking aspect comes in the form of a Prediction System, wherein members can bet (with virtual money) on the outcome of any given story. The prediction is a percentage based on who wagers how much each way; wagering more on a story moves the prediction more.
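The wager-weighted percentage described above can be sketched in a few lines. This is purely a hypothetical illustration of the idea, not The Standard's actual algorithm; the function and data names are my own.

```python
# Sketch: a prediction as the wager-weighted percentage of "yes" bets.
# All names here are illustrative, not The Standard's real system.

def predict(bets):
    """Return the percentage predicting 'yes', weighted by amount wagered."""
    yes = sum(amount for outcome, amount in bets if outcome == "yes")
    total = sum(amount for _, amount in bets)
    return 100.0 * yes / total if total else 50.0  # no bets yet: call it even

bets = [("yes", 100), ("no", 50), ("yes", 25)]
print(round(predict(bets), 1))  # the two "yes" wagers dominate: 71.4
```

Because the percentage is weighted by amount rather than by head count, a single large wager can move the prediction as much as many small ones, which is what distinguishes this from a plain one-member-one-vote system.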

But is this just more of the same? One could view the prediction system as a remodeled Digg-like voting system and argue that we don't need yet another site to tell us about the Microsoft bid. This may end up being just another cliché site that gathers a core audience of contributors. Now, if they add in real-money betting... that would be interesting.

Check out the new Industry Standard and try it for yourself.

Tuesday, February 05, 2008

Who needs Newspapers?

A few months ago a group of children came to my door selling newspaper subscriptions. At the time I didn't subscribe to any newspapers, and I still don't. The era of the newspaper is coming to an end.

The presence of online news feeds, local and global bloggers, and other web 2.0 technologies provides me with more up-to-date news than any printed paper could hope to match. Furthermore, I can see so many varying points of view on any given event that I can form a more balanced view of the news.

Take the recent story about the FARC protests, for example. This piece says the original protest started on Facebook:

"The protest was started less than a month ago on the social networking website Facebook by a 33-year-old engineer, Oscar Morales, from his home in Barranquilla on Colombia's Caribbean coast.

Over 250,000 Facebook users signed on, and the movement was taken up by newspapers and radio and television stations across the country. "

And if you had any doubt about the veracity of the story, or wanted to see if it had any spin on it, you could just check Google News with the right query words. You would then have noticed that many nations in addition to Colombia were protesting against FARC, and that the protest did spring from Facebook.

Simply put, why would I want to subscribe to a newspaper nowadays?

Monday, February 04, 2008

SAT scores and books...

This site presents a visualization of SAT scores as related to favorite books listed on Facebook profiles. Although I don't know if I agree with the subject matter, I think it's a pretty unique manipulation of the data available on popular social websites. Which brings up the question that Mr. Chen asked us the other day: what kinds of uses of personal data should be acceptable from social websites?

Thin Versus Fat Clients

Last class we had a discussion on the relevancy of desktop computers. Will desktop computers be obsolete in the near future?

A thin client is a bare-bones computer that relies on a central server for data processing. Desktops, or fat clients, perform data processing locally. For a business, the benefits of a thin-client setup over a fat-client setup are numerous. Primarily, a thin-client setup is cheaper to run, since thin clients require cheaper hardware and less power and, because all processing occurs on a central server, less administrative attention.

In an office using thin clients, an application update occurs in one place: on the server, and all of the client users are upgraded without any hassle. If a thin client fails, it is inexpensive enough to simply throw away and replace with a new one. The user logs into the system on the replacement client and resumes work.

As we discussed last class, applications like Google Docs are already helping thin clients become a reality. If all a user is doing is loading up an internet browser, is there a reason for a powerful desktop computer? If all you are doing is updating a spreadsheet or working with email, probably not. I do not believe that desktop computers will ever become obsolete as there will always be a need for processor intensive activities, but I do believe that desktop computers will become a lot less common, at least as far as businesses are concerned.
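As a toy illustration of the split described above (the names and the spreadsheet example are mine, not any particular product's design), the "server" side does all the computation while the "thin client" side only renders what it receives:

```python
# Illustrative sketch of the thin/fat client split: all processing happens
# in the server function; the client function does nothing but display.

def server_compute(cells):
    """Central server does the work: sum a column of spreadsheet numbers."""
    return sum(cells)

def thin_client_render(result):
    """The thin client performs no computation; it only formats output."""
    return f"Total: {result}"

print(thin_client_render(server_compute([10, 20, 30])))  # Total: 60
```

Upgrading `server_compute` upgrades every user at once, which is exactly the one-place-update advantage the post describes.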

Barack Obama meets Web 2.0

Barack Obama's campaign page has a social network attached to it. The social network offers a new twist on organizing grassroots campaigning by linking users of the site with people in their towns. It also allows people to create fundraising events and join existing events (akin to Facebook events). The site really embraces the whole web 2.0 mashup idea, with videos driven by the YouTube mini-player, Google Maps mashups to locate events in your area, and Facebook applications to link Obama to your profile page. I'd be interested in seeing metrics for how this might translate into increased awareness / primary votes, but what is certain is that it offers a unique way for a candidate to organize campaign efforts in a central location. It's especially beneficial, I'm sure, with the Super Tuesday vote complicating campaigning efforts. His diehard supporters can independently organize large events while requiring little effort on Barack Obama's part.

On a last note, Obama's site offers a calling option that lets supporters join his phone campaign from the comfort and convenience of their own homes. Again, I'd love to see metrics, but I'm sure this allows Obama to place a lot more phone calls than any of the other candidates. If nothing else, I'm sure that come 2012, a lot of these techniques will become much more commonplace in candidates' campaigning efforts.

Microsoft + Yahoo : A Web 2.0 View

Microsoft's proposal to acquire Yahoo! is likely to shake up the online advertising, Web 2.0, and open-source markets. Nowadays Google stands as the undisputed chief player in the Web 2.0 market, and Microsoft cannot match it alone, even though products such as SharePoint pose a good challenge to Google's APIs. Together, Microsoft and Yahoo! dominate free (Web 1.0) online e-mail services, with Hitwise data showing Yahoo! Mail / Yahoo! Address Book at 58 percent market share (#1) and Windows Live Mail (#2) at 25.5 percent. Yahoo! is also on its way to being a real player in the Web and open-source world: it has released tons of code via its developer programs and pushed some really innovative services aimed at Web developers, such as Yahoo! Maps and Flickr. This tie-up would provide good competition to Google, especially on features such as enhanced security and an expert eye. I hope the proposed merger adds a new dimension to Web 2.0 development and brings more to developers and, ultimately, to end users.

Sunday, February 03, 2008

Long tail theory and web 2.0

Actually, the long tail theory was valid before we entered the web 2.0 age. For example, grocery stores rely on high sales volume to make a profit, while some luxury car manufacturers can survive by selling only a few hundred cars. At this point I do not think the long tail theory is directly related to web 2.0.

Copying web 2.0 site is not always successful

eBay has retreated from the Chinese market due to declining business, even though it entered that market much earlier than TaoBao, a local company whose business was similar to eBay's in its early stage. It is said that eBay suffered in China because it did not tailor its business model to the characteristics of the Chinese market. So simply copying a model does not guarantee success.

Please follow the link for details: Is eBay Bailing out of China?

Google releases social graph API

Google just released its Social Graph API. The API tries to solve the problem of defining link relationships by providing multiple choices such as contact, friend, met, co-worker, colleague, co-resident, neighbor, child, parent, sibling, spouse, kin, muse, crush, date, and sweetheart. But when interpreting any social website, there is still the problem of the time-dependent relevance of such links.
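The relationship terms listed above come from the XFN vocabulary, which is expressed in the `rel` attribute of ordinary hyperlinks and is among the markup the API crawls. Here is a minimal, stdlib-only sketch (the class and example URLs are mine, not Google's) of pulling those relationship edges out of a page:

```python
# Sketch: extract XFN-style relationship edges (href, rel values) from
# anchor tags. Illustrative only; not Google's Social Graph API client.
from html.parser import HTMLParser

class RelExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.edges = []  # list of (href, [rel values]) tuples

    def handle_starttag(self, tag, attrs):
        # XFN relationships live in the space-separated rel attribute of <a>
        if tag == "a":
            d = dict(attrs)
            if "href" in d and "rel" in d:
                self.edges.append((d["href"], d["rel"].split()))

p = RelExtractor()
p.feed('<a href="http://example.com/alice" rel="friend met">Alice</a>')
print(p.edges)  # [('http://example.com/alice', ['friend', 'met'])]
```

Note that nothing in the markup says *when* the relationship held, which is exactly the time-dependence problem mentioned above.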

Apple and Web 2.0

For the past eight years, Apple has been doing its best to jump onto the Web 2.0 bandwagon. In 2000 it introduced a free "iTools" service, available only to Mac users, that provided an e-mail account, easy webhosting, photo upload (the iPhoto application could put entire photo albums online automatically via iTools with a single click) and Internet-based system backups/synchronization across systems. Apple replaced the free iTools service with the subscription-based .Mac in 2002, which cost users $99 a year. There wasn't really anything all that revolutionary about the iTools/.Mac service, other than the ease with which people could upload pictures. At that time, sites like Flickr, Google's Picasa and Facebook's photo albums didn't exist. It was (and still might be) the easiest way to get your photos on the web.
In April of 2005, Apple released the Mac OS X 10.4 "Tiger" operating system. This version of the OS added a new feature dubbed "Dashboard." Dashboard is essentially a special layer, not normally shown, in which mini-applications dubbed "widgets" live. The layer is summoned and dismissed using a hotkey, and widgets can be dragged around while Dashboard is displayed. The key aspect that makes Dashboard relevant to Web 2.0 is the fact that widgets are programmed in HTML, JavaScript and CSS. They essentially live within special micro web environments, and can easily fetch information from existing web sites or provide specialized interfaces such as a weather grabber, a new-mail checker for Gmail or a current Orioles baseball scoreboard. The possibilities are endless, as any data that exists on the web can be clipped and displayed in a widget. Other similar widget engines are available (Yahoo provides a free one that works on both Mac and PC), but Dashboard is probably the most widely used, mostly because it ships with every copy of Mac OS X.
Apple's 2007 entrance into the mobile phone market also marks its latest attempt at leveraging Web 2.0. Rather than provide an SDK to allow developers to make custom applications that could be compiled to run on the iPhone, Apple instead insisted that the advanced Safari web browser on the device would be sufficient to run any application that users required, thanks to its advanced JavaScript/AJAX and CSS rendering support. Developers would simply deploy their AJAX-based Web 2.0 apps to the web, and users would bookmark them.
While it was a good idea in theory, the developer base objected to not being able to develop advanced local apps that could take advantage of the full feature set of the iPhone. Within a month, developers had assembled a rogue toolchain that could build binaries to run on any hacked iPhone, and each successive firmware release has been a game of cat and mouse, with the hack developers coming up with new ways to let third-party software run and Apple closing them off. Eventually Apple conceded, and will be releasing an official SDK later this year.
The latest iPhone firmware release that just came out in January 2008 adds the ability to create a home screen icon for Web 2.0 applications as an alternative to bookmarking them within the web browser. This finally puts the web-based apps on the same level as the Apple-provided apps, and gives users a more intuitive interface for launching these apps. Perhaps if Apple had provided this from the beginning, developers wouldn't have been quite as eager to hack the phone to get their apps on there. Regardless, there are a number of innovative and useful Web 2.0 apps already available for the iPhone, and the addition of native apps will only make the phone a better and more versatile platform for both users and developers.

Politics and Web 2.0

The Internet is awesome because everyone can access information, something that politicians don't always want. As mentioned in a Wired article, information about candidates and their fundraising spreads quickly and is definitely available. I think politics is a little slow in picking up on the networking aspect of the social web, but the voters certainly have jumped on it (see the Obama Girl video).

Mobile Web 2.0

With the spread of broadband-enabled mobile devices come new possibilities
for the way we interact with the mobile Internet. For instance, Ajit Jaokar discusses a concept called "spatial messaging": the ability for someone to take a picture of a location, add some text or a short comment, and attach both to that location, so that when a friend passes through, the picture and text appear on their mobile device. That said, one wouldn't expect applications like that to work well on stationary PCs. Do you think this means that mobile and stationary websites will remain distinct throughout web 2.0?
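The core of spatial messaging is a proximity check: a note pinned to coordinates is delivered when a device reports a position within some radius. A minimal sketch, assuming nothing about Jaokar's actual design (all names, coordinates, and the 100 m radius are my own illustrative choices):

```python
# Sketch: messages pinned to coordinates, delivered on proximity.
# Uses the haversine great-circle distance; purely illustrative.
import math

EARTH_RADIUS_M = 6_371_000

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Hypothetical pinned messages: (lat, lon, attached text)
messages = [
    {"lat": 39.2555, "lon": -76.7113, "text": "Great coffee here"},
    {"lat": 48.8584, "lon": 2.2945, "text": "Meet at the tower"},
]

def nearby(lat, lon, radius_m=100):
    """Return texts of messages pinned within radius_m of this position."""
    return [m["text"] for m in messages
            if distance_m(lat, lon, m["lat"], m["lon"]) <= radius_m]

print(nearby(39.2556, -76.7114))  # walking past the first pin
```

A real service would of course filter by friendship as well as distance, which is where the social-network layer comes in.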

Saturday, February 02, 2008

Insurance companies and web privacy

In class on Wednesday we discussed how insurance companies could use social networking or blogging sites to discover things about you. Apparently insurance companies are already looking at sites such as Facebook and MySpace.

In a lawsuit currently in New Jersey federal court, several families are suing Horizon Blue Cross/Blue Shield. The families are accusing Horizon of denying claims submitted for treating their children's anorexia and bulimia.

The judge in the case issued a court order requiring the families to turn over emails, diaries and any writings "shared with others, including entries on Web sites such as 'Facebook' or 'MySpace.'"

From the article:
Horizon claims that the children's online writings, as well as journal and diary entries, could shed light on the causes of the disorders, which determines the insurer's responsibility for payment. New Jersey law requires coverage of mental illness only if it is biologically based.

Horizon wants to use the kids' web postings to show that their eating disorders are caused by emotional problems and do not have a biological basis.

Imagine if insurance companies went back and read your blog postings, and think about what they would be able to do with that information. They could see when you started complaining about symptoms of disease X. Even if you weren't diagnosed with disease X until after you signed up for insurance, couldn't the company use your own writings to prove that you had a pre-existing condition? As we discussed in class, insurance companies could look for people who post pictures of themselves drunk at parties or smoking and charge them higher premiums for being a "high risk".

Maybe you didn't post anything incriminating, but what about your friends? Could lifestyle factors that make you a higher risk be determined by analyzing who your friends are?

Friday, February 01, 2008

Republicans Hate Their Candidates

For the past year I've been working on a system to monitor sentiment in political blogs. PolVox is a working prototype that monitors known political blogs. It provides trend analysis charts and a keyword search interface. While it still has a lot of bugs (like double-counting some things), I would like to point out a few things to you.

My system shows that Republicans hate their candidates. First up is John McCain, followed by Fred Thompson and Mike Huckabee. The only candidates who seem to be doing OK are Ron Paul and Rudy Giuliani. Since Rudy is gone and Republicans don't take Ron Paul seriously, they're stuck with a distasteful set of candidates.

Meanwhile, Hillary Clinton and Barack Obama both are doing quite well within their own parties.

Does 1 Web 1.0 company + 1 Web 1.0 Company = a Web 2.0 Company?

This isn't a look at a controversial social web business model, nor is it a look at a successful web 2.0 company. But I think it is huge news in the web business: Microsoft has offered to buy Yahoo. Both companies have been struggling lately, and from my reading, it appears that Microsoft thinks a combined company will provide a strong competitor to Google in the online advertising business. I tend to like reading Paul Kedrosky's analysis on these topics, and he says that it really may not make much of a difference. Combining two failing web 1.0 companies won't create a company able to compete against a web 2.0 company. In fact, in a later post he mentions that Google would be able to exploit this to grow its market share. As someone who has been through one corporate merger and will likely go through another in the next six months, I can attest that productivity suffers greatly during mergers.

Privacy Issues In Social Web. A Good Real Example is Us!

A few days ago, Dr. Chen sent me an invitation to the Weekly Blogging Assignment spreadsheet. It looked kind of strange to me because it contained UMBC Campus IDs instead of just the email IDs. Later I was told that this was done for privacy reasons.

Well, it was a good decision to go this way. But maybe he should not have trusted the app he chose to host this file: Google Docs can reveal some of that information anyway.

There is more than one way to learn which ID belongs to which user. First, when you open the document and open the "Discuss" pane on the right, it shows a colored box in front of the user(s) currently editing the document, and the area of the document that user is editing is highlighted in the same color. Assuming that users work only in their respective rows, one can tell which Campus ID belongs to which user.

Another, easier way to reveal this is by viewing the "Revisions" to the document. This is self-explanatory.

That's why I say we need security first.

Another minor issue: I am able to see unpublished drafts some people have saved on blogger. Beware, others might steal your posts/ideas :)

If someone's ID got revealed because of the picture above, or if there are any serious implications, please let me know and I will delete this post.