👋 Hello World

8 questions about the web you always wanted answers to

The most popular 10,000 websites analyzed - 8 Questions & Answers

Last week I had the urge to do some real quantitative data analysis. After several days of programming data collection scripts, I had compiled a large database. If you want to see the source of my results, check the raw outcomes on this page. In this post I’ll answer the most interesting frequently asked questions about the web (NSFNN* alert):

Is porn dominating the web?

Of the 10,000 most popular websites, 10% are marked as adult oriented. That sounds like a lot, but the total reach of these sites is a mere 5%. So the answer is no, the web isn’t all 18+.

Interesting is the fact that the Netherlands (population 16M) ranks 3rd in owning adult websites. As they say: where a small country can be big. The USA is of course number one, and China doesn’t even appear in the list although it owns 10% of the websites.

Is China taking over the web?

Luckily the answer is short: no. The USA owns 44% of all websites, and China comes second with 9%. That is less than the share of European websites, which is 16%. But in reach Europe loses to China, with 7% against 9% respectively. The conclusion is that Chinese language courses aren’t necessary yet.

Hola, 你好, Konnichi-wa – excuse me, what language?

Although one might argue that my data is off (more people understand Chinese and Spanish than English), a majority (55%) of the websites are in English. Chinese takes second place, and Spanish third. Arabic is also well represented with 3.3%. None of the other languages take a significant part of the web. Sorry, French people.

Are all websites made in Silicon Valley?

This is actually more or less true. Of all US states, California (37% reach) has a significant advantage over any other state; it owns 7% of all the identified websites in the top 10,000. New York comes second in number of websites, but Washington has a higher reach (22%).

I was already link building my Geocities.com account!

Maybe link popularity wasn’t a hype in the early days of the web. But the data does show that geocities.com deserves its PageRank of 10/10, because it has over eight times more incoming links than google.com (2nd). So link building might not be as hip and trendy as you would expect.

The most linked list continues with the usual suspects like Adobe, Amazon, Microsoft, Wikipedia and Apple. I don’t know why, but third - with 260,000 incoming links - is some Chinese website (http://miibeian.gov.cn). Does anybody know what it is? Update: explained

Is it true that Yahoo and MSN are more used than Google?

The statistics are ambiguous on this. But going by my data, Google actually has the biggest reach (9%) if you add all 72 local domains together. In number of views Google loses to Yahoo!, which has 12% (!) of the total views. (Damn you, Yahoo! Games)

About MSN: I personally only happen to land there if I mistype a domain, or check my spam (hotmail) but they still seem to take 4% of the total reach-pie.

Has the web evolved to web 2.0?

Web 2.0 is hard to measure (maybe because it doesn’t exist). But I’ve tried, by locating RSS feeds and stylesheets. And the results are actually quite surprising. 10% of all the homepages provide an RSS feed (whether people actually use these RSS feeds is of course a different analysis).

And 58% use stylesheets on their homepage for layout.

So maybe we can conclude there is actually some evolving going on, and <table>’s aren’t dominating design style anymore.
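For anyone who wants to repeat this kind of count, here is a minimal sketch of how a homepage could be checked for an RSS feed and a stylesheet. It is not the script used for the numbers above; the function name `detectFeatures` and the exact detection rules (a `<link>` with an RSS type, a stylesheet `<link>` or an inline `<style>`) are my own assumptions.

```php
<?php
// Hypothetical helper: reports whether a homepage's HTML links an RSS feed or uses a stylesheet.
function detectFeatures(string $html): array
{
    $doc = new DOMDocument();
    // Real-world homepages are rarely valid HTML, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $features = ['rss' => false, 'stylesheet' => false];
    foreach ($doc->getElementsByTagName('link') as $link) {
        $rel  = strtolower($link->getAttribute('rel'));
        $type = strtolower($link->getAttribute('type'));
        if (strpos($rel, 'alternate') !== false && $type === 'application/rss+xml') {
            $features['rss'] = true;
        }
        if (strpos($rel, 'stylesheet') !== false) {
            $features['stylesheet'] = true;
        }
    }
    // Inline <style> blocks also count as using stylesheets.
    if ($doc->getElementsByTagName('style')->length > 0) {
        $features['stylesheet'] = true;
    }
    return $features;
}
```

Run it over every fetched homepage and count the `true` values to get percentages like the ones in this section.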

Why do I always see ‘ads by Goooooogle’?

I’ll tell you why: 6% of the homepages contain Google ads! That is a 55% reach of all advertising networks identified. And it gets even better. If you add the Google ads on the Google search engine they have a total 12% reach. The Doubleclick network has a reach of 7% with 403 websites. This is - more or less – also nice.

*NSFNN: not safe for NOT nerds

*** For comments drop an email to the address on the right.

Posted by

Why Can't I Change?

some thoughts on change

It’s in our human psychology to keep the status quo: we prefer going the route we were going all along. The opposite of the status quo is change. Humans are very bad at initiating change. Change takes effort, is unpredictable, creates risk, and worst of all, means that we were wrong before.

Changes are often wanted to improve a current situation. At a certain point you have to decide to change while the option to continue is still open. Visualize this as a crossroads where you can continue, but also change and turn. Wanting to change rarely succeeds.

Most changes made in our lives are forced changes:

  • Forced into change: at some point you are at a “T” intersection, forced to make a decision because continuing is no option. The change is often postponed as long as possible.
  • Gradual change: at some point you are at a “Y” intersection, where continuing straight on is no option, but a decision (thus change) has to be made.

Our day-to-day decisions are made unconsciously through the use of heuristics. It’s too complicated for our brains to think everything over. Change has to be initiated by our conscious mind, because our unconscious mind will prefer the status quo and its heuristics. These separate parts of our brain don’t work well together, and the old heuristics conflict with the change wanted by our consciousness.

E.g. smoking
All signs and information indicate that people should stop smoking because it makes them sick, and of all the millions of people that smoke, around 70% want to stop. Quitting smoking is one of those changes you have to decide upon, put effort into, and do on your own. But only a few (6% actually succeed) manage to change (stop) without a doctor telling them that it is quit or die (the “T” intersection).

Question
Think about it: what have you ever consciously changed in your life?

Create your own Tag Cloud - Easy!

For a website - that wanted to be very web 2.0 - I had to create a tag cloud like the ones on del.icio.us or Flickr. People think they are cool and useful, so who am I to disagree?! Why re-invent the wheel every time when we have the internet as an unlimited source of code examples to steal?

So as part III of my coding-give-aways* (I,II) I give you:

The Tag Cloud Creator

1) make a $variable with all the words you want in your tag cloud.

2) grab this php example file that is only 30 lines in size (you can use it any way you want)

3) include it somewhere on your site, upload it to your server and - if you are not the dumbest nerd - you should get something like this:
image

Digg.com - as search cloud or try it on other sites.

5) now you have created your own tag cloud to use for a search engine, photo archive, or whatever you want.
So have fun, and tell your social community friends.
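
If you are curious what happens inside such a file, here is a minimal sketch of the general idea: count how often each word occurs and scale the font size with the count. This is not the downloadable example file itself; the function name `tagCloud`, the pixel sizes and the `?q=` link format are assumptions made for illustration.

```php
<?php
// Builds tag-cloud HTML: more frequent words get a larger font size.
function tagCloud(string $text, int $minSize = 10, int $maxSize = 30): string
{
    $words = str_word_count(strtolower($text), 1); // format 1 returns the list of words
    if (!$words) {
        return '';
    }
    $counts = array_count_values($words);
    ksort($counts); // alphabetical order, like most tag clouds
    $low    = min($counts);
    $spread = max($counts) - $low;

    $html = '';
    foreach ($counts as $word => $count) {
        // Scale the count linearly between the smallest and the largest font size.
        $size = $spread > 0
            ? $minSize + (int) round(($count - $low) / $spread * ($maxSize - $minSize))
            : $minSize;
        $html .= sprintf(
            '<a href="?q=%s" style="font-size:%dpx">%s</a> ',
            urlencode($word),
            $size,
            htmlspecialchars($word)
        );
    }
    return trim($html);
}
```

Calling `echo tagCloud($variable);` with the $variable from step 1 prints the cloud; point the `?q=` links at your own search page.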

Easy Fuzzy Logic with MySql – The end of “no results found”

As a web programmer I ran into this problem: a complicated (user) search on MySQL is often too strict and returns the well-known “no results found”, while good (although not perfect) results exist!

The problem

When a traditional search is initiated, SQL queries are generated along the lines of:

User search: WHERE tv_manufacturer = 'sony' AND tv_description LIKE '%widescreen%' AND tv_price < 1000;

A user is asking for a television that is a Sony AND widescreen AND less than 1000 dollars. This gives very accurate results, but it misses out when the best matching TV is $1050. In real life the user would be okay with paying $50 more, but our query won’t allow it. We want those almost-perfect matches shown too!

This query could be rewritten by replacing AND with OR, but then we get inaccurate results, because it returns any TV below 1000 dollars OR any Sony OR any widescreen. Useless.

The good news is that we can solve this without asking the user for something as factual and nerdy as WIDESCREEN AND (SONY OR 1000 DOLLAR), which is way too difficult.

The answer lies in what is called ‘fuzzy logic’. Fuzzy logic is more natural and (semi-)intelligent, thanks to mathematical algorithms:

User search: a preferably Sony TV with widescreen support for more or less a 1000 dollars, I prefer less. Please.

A few specialist software companies offer fuzzy logic software, but it is highly tailored to the specific needs of a system.
MySQL, however, has a solution that gives accurate results with a few hacks.

The solution:

The solution is to be found in the “MATCH AGAINST” function of MySQL. It is a text matching system where you can add your preferences, and the query gives each row a score to indicate how well it matches.
Very few people use this, maybe because they are disappointed that it only matches text. But in this post I will show you how to also integrate a demand that is less strict in the real world, like: less than $1000.

We do this by encoding the numbers as words. In this case the price of each TV in the database is encoded into a unique word like “pricemaxthousand”, etc.

All the features of the TV are being stored in a new (text only) column named encodedsqlrow.
So we get this: encodedsqlrow = “sony widescreen pricethousandtotwothousand diagonalthirtyinch”.
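
As a rough sketch of that encoding step, the snippet below turns one TV row into the text that goes into encodedsqlrow. The price boundaries, the third bucket name `priceabovetwothousand` and the array keys are assumptions for illustration; pick buckets that fit your own data.

```php
<?php
// Hypothetical price buckets; the first two token names follow the examples in this post.
function encodePrice(float $price): string
{
    if ($price <= 1000) {
        return 'pricemaxthousand';
    }
    if ($price <= 2000) {
        return 'pricethousandtotwothousand';
    }
    return 'priceabovetwothousand';
}

// Turns one TV (as an associative array) into the words stored in encodedsqlrow.
function encodeRow(array $tv): string
{
    $words = [
        strtolower($tv['manufacturer']),
        $tv['widescreen'] ? 'widescreen' : '',
        encodePrice($tv['price']),
    ];
    // array_filter drops the empty string of a non-widescreen TV.
    return implode(' ', array_filter($words));
}
```

A Sony widescreen of $1050 would be stored as "sony widescreen pricethousandtotwothousand", so it can still be found by someone who preferred to stay under $1000.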

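With the MATCH AGAINST function we can also search “IN BOOLEAN MODE”. This lets us attach a ‘preference’ to every search demand (word) in our query.

The preferences you can give to a demand (word) are in the order of:
+ = Obligated (the word must be present)
> = Important (the word counts more towards the score)
~ = More or less important (note: the MySQL manual describes ~ as lowering the score of rows that contain the word without excluding them, so check that it ranks the way you expect; a word with no operator at all is optional and raises the score)
- = Without (the word must not be present)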

And last but not least, we can retrieve a score with every result, so the most accurate results can be listed at the top.

With all this together we (a user) can create a search query that results in more natural, human-like picked results.

Creating our query:

if($demandpricemax < 1000)
$encodedsearch = ">sony +widescreen ~pricemaxthousand";
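
When there are more than a couple of demands, building that string by hand gets messy. Below is a minimal sketch of a builder that maps a preference level per word to a boolean-mode operator. The function name `buildSearch` and the level names are my own; note that it uses a bare word (no operator) for "nice to have" demands, because the MySQL manual says a word without an operator is optional and raises the score.

```php
<?php
// Builds a boolean-mode search string from [word => preference level].
function buildSearch(array $demands): string
{
    // Level names are hypothetical; the operators are MySQL's boolean-mode operators.
    $operators = [
        'required'  => '+',
        'important' => '>',
        'optional'  => '',
        'excluded'  => '-',
    ];
    $parts = [];
    foreach ($demands as $word => $level) {
        // Keep only letters and digits so user input cannot smuggle in operators.
        $word = preg_replace('/[^a-z0-9]/', '', strtolower((string) $word));
        if ($word !== '' && array_key_exists($level, $operators)) {
            $parts[] = $operators[$level] . $word;
        }
    }
    return implode(' ', $parts);
}
```

For example, `buildSearch(['sony' => 'important', 'widescreen' => 'required', 'pricemaxthousand' => 'optional'])` gives a string you can pass as the search terms of MATCH ... AGAINST.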

Getting the score:

SELECT tv_manufacturer, MATCH (encodedsqlrow) AGAINST ('$encodedsearch' IN BOOLEAN MODE) AS score

Setting the match search:

WHERE MATCH (encodedsqlrow) AGAINST ('$encodedsearch' IN BOOLEAN MODE) ORDER BY score DESC
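
Put together, the score part and the WHERE part form one query. Here is a sketch that returns the full SQL plus its parameters, so the search string is bound as a parameter rather than pasted into the query. The table name `tv`, the function name `fuzzyQuery` and the LIMIT are assumptions; the column names come from the examples above.

```php
<?php
// Returns [sql, params] for a scored boolean-mode search; the table name `tv` is an assumption.
function fuzzyQuery(string $encodedSearch, int $limit = 10): array
{
    // Two differently named placeholders, because one name cannot always be reused in a statement.
    $sql = 'SELECT tv_manufacturer,
                   MATCH (encodedsqlrow) AGAINST (:score_terms IN BOOLEAN MODE) AS score
            FROM tv
            WHERE MATCH (encodedsqlrow) AGAINST (:where_terms IN BOOLEAN MODE)
            ORDER BY score DESC
            LIMIT ' . max(1, $limit);

    return [$sql, [':score_terms' => $encodedSearch, ':where_terms' => $encodedSearch]];
}

// Usage with an existing PDO connection $pdo (not created here):
// [$sql, $params] = fuzzyQuery('>sony +widescreen pricemaxthousand');
// $stmt = $pdo->prepare($sql);
// $stmt->execute($params);
// $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
```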

Example Page – integrated:
For a Dutch website I built this function to match all studies (1800) against the many demands of a prospective student. He could say: I am searching for a study that is obligated to be in Amsterdam, more or less important in the economic field, important with an average workload, important with mostly female students, and more or less important at a university.
That is a lot of demands, and it still gives accurate results, including studies in Amsterdam that have mostly male students.

Have any questions or want to bash this text: email address is on the right hand side of your screen.

Note: the database column (encodedsqlrow) must have a FULLTEXT index (in phpMyAdmin: the blue “T” under ‘actions’). This makes it searchable for the MATCH AGAINST function. Otherwise it won’t work.

Sources:

http://en.wikipedia.org/wiki/Fuzzy_logic
http://www.seattlerobotics.org/encoder/mar98/fuz/flindex.html
http://www.wcc.nl/
http://www.kiesjestudie.nl/l-studietest.html
http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html
http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

Easy AJAX inline text edit 2.0

As everybody knows, refreshing pages is so 1999. AJAX, DOM, whatever you call it makes it possible to let people edit a piece of text inline without having to use a submit button.
You say: but that ain’t new at all! I say: But all of this has been made easy to use and implement: 2.0!

Example page: inline edit (no JS knowledge needed) [source ] | Inline example: Please edit me!

How you can make it work (5 easy steps for integration)

  • Download this javascript file: InstantEdit 2.0 JS
  • Create an update file that handles the input. For example this PHP: update file
  • In your page add the javascript:
  • Optional: Set fixed vars (like hidden elements in a field post). These will be posted with the editable field so you can identify a user/session.
  • Last step: in your HTML for any editable field add a SPAN around it.

You’re done!

How it works

A small piece of javascript reads all SPAN tags and checks whether each has class=“editText” and an id=. If so, it adds an onclick function. That onclick function creates a textarea or input (depending on the size of the editable text), so the visitor can edit the field. When the field is blurred, the script reads its contents, starts an XMLHttpRequest and ‘sends’ the content + fieldname + any set vars to an update file. That file updates your database and replies with the newly set text, and the input field disappears again.

Compatibility

This script works in Internet Explorer, Firefox, Chrome, Opera and Safari.

Update hack

If you want to force a textarea over a textfield (for example to edit a piece of HTML) use class=“editText” offsetHeight=“10”.

If you want to PUSH an ID to your script I use: id=“edit_userID_$userID”. In your update script, strip the text, and keep the $userID. Et voila.
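
A minimal sketch of what such an update script could do with that id is shown below. The POST parameter names `fieldname` and `content` are assumptions (check what the javascript file actually sends), and the database write is left as a comment.

```php
<?php
// Extracts the numeric user ID from a field name such as "edit_userID_42".
function extractUserId(string $fieldName): ?int
{
    return preg_match('/^edit_userID_(\d+)$/', $fieldName, $m) === 1 ? (int) $m[1] : null;
}

// Handles one inline edit and returns the text to send back to the page.
// The keys 'fieldname' and 'content' are hypothetical parameter names.
function handleUpdate(array $post): string
{
    $userId = extractUserId($post['fieldname'] ?? '');
    if ($userId === null) {
        return 'Invalid field';
    }
    $content = trim($post['content'] ?? '');
    // ... store $content for $userId in your database here ...
    return htmlspecialchars($content); // escaped, so it is safe to show in the page again
}

// In the update file itself you would end with: echo handleUpdate($_POST);
```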

What is going wrong with msn search

Microsoft bashing is easy… but when they say stuff like “Microsoft (search) will be better than Google within 6 months”, I say… bullsh*t.

Everybody knows this screen when entering a non-existent url:
image

Think for one second and ask yourself whether you have ever had any relevant results. Well, I’ve never had any in the past three years. How hard can it be to make this function work? They just don’t care: the only way I end up on an MSN network site is by this hijack, and it’s never been helpful. First impressions really count… and by the way, the new MSN search results still suck: they already had their update a few months ago, so why didn’t they do it right the first time if it was so easy to beat Google?

SERP correct domain 301 redirect website solution

There is a lot of talk in the SEO world about what to do when search engines list multiple URL versions of the same site (page) in the SERPs. For example:

  • https://yvoschaap.com
  • https://www.yvoschaap.com
  • https://www.yvoschaap.com/index.php

These could all be viewed as different pages by search engines, while they are not. For some sites this resulted in a “duplicate content” penalty. I advise you guys to read more about this here or here.

My general solution to this is this simple and small piece of PHP. So NO htaccess stuff, or manual redirecting, etc.

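$domain = "www.yvoschaap.com";
$uri = $_SERVER['REQUEST_URI'];
// redirect when the host is not the canonical one, or when the /index.php version is requested
if($_SERVER['HTTP_HOST'] != $domain || preg_match("#^/index\.php(?=$|\?)#i",$uri)){
  // drop the leading /index.php so the redirect target is not redirected again
  $uri = preg_replace("#^/index\.php(?=$|\?)#i","/",$uri);
  header("HTTP/1.1 301 Moved Permanently");
  header("Location: http://".$domain.$uri);
  exit();
}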

It says this: if the host isn’t the “www” version OR the request is for the /index.php version, show an official “moved” page with the right location: https://yvoschaap.com/
Just put it at the top of your index.php file, et voila. No rocket science…
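
If you want to check the rule without firing real requests, the same decision can be written as a small function that only returns the target URL. This is a sketch, not a replacement for the snippet above; the function name `canonicalRedirect` is made up, and it assumes the site is served over plain http like in the example.

```php
<?php
// Returns the canonical URL to redirect to, or null when the request is already canonical.
function canonicalRedirect(string $host, string $uri, string $domain): ?string
{
    // Matches a leading /index.php that is followed by the end of the path or a query string.
    $pattern = '#^/index\.php(?=$|\?)#i';
    $isIndex = preg_match($pattern, $uri) === 1;

    if (strcasecmp($host, $domain) === 0 && !$isIndex) {
        return null; // right host, no index.php: nothing to do
    }
    $path = $isIndex ? preg_replace($pattern, '/', $uri) : $uri;
    return 'http://' . $domain . $path;
}

// Usage at the top of index.php:
// $target = canonicalRedirect($_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI'], 'www.yvoschaap.com');
// if ($target !== null) { header('Location: ' . $target, true, 301); exit(); }
```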

Got myself some gadgets...

Today I felt like spending some money, and that brought me to buy a home cinema projector (CONTRAST 5000:1, 1100 ANSI-LUMEN, HD RES) and with that an Xbox360. And WOW, that’s crazy cool.

If you bring a DVD or Xbox game you can come and watch or play on my new 3 meter screen.

And Microsoft did a great job on the Xbox. I don’t have games yet, and was hoping for built-in WiFi, but having the Xbox (with the full screen HD projector) control the media files on your computer works! And it’s easy! No weird settings, all plug-and-play, and within a minute I am listening to some album on my PC, through the local network, via the Xbox to my stereo, controlled remotely from the toilet. Take that…

Update: Okay, I have to come back on the Xbox… somehow it has some big ass glitches: the sync (video: audio) when playing DVDs is almost always wrong! Xbox Live? It has never really worked with Need for Speed. It has no good connection, and it freezes up. And last but not least, when playing a DVD in HDTV format you get some weird stripes throughout every movie you try to play.

CSS: Star Rater Ajax Version

So, I found this great star rater script made in CSS. But I missed the web 2.0 stuff. So I played around with it to make it work with a database without having to refresh any pages (the database is updated with AJAX). In this version I use it for rating an image (with unique ID = imgId).
Screenshot:

not a beginners tutorial - and just follow the steps


1) The star image:

Star rate image (use save as..)

2) Javascript part (to do the dynamic stuff):

Javascript with AJAX module + rateImg function. (use save as..)

3) PHP part: Create an update.php file (to do the database update with the user rating):

if (isset($_GET['rating'], $_GET['imgId']) && is_numeric($_GET['rating'])) {
    // mysqli is used here because the old mysql_* functions were removed in PHP 7
    $dbh = mysqli_connect("localhost", "#######", "#######", "#######")
        or die('I cannot connect to the database because: ' . mysqli_connect_error());
    $rating = (int) $_GET['rating'];
    $imgId  = $_GET['imgId']; // bound as a parameter below, so it cannot be used for SQL exploits

    // adds the rating to imgId in the database
    $update = mysqli_prepare($dbh, "UPDATE vote SET voteNr = voteNr + 1, voteValue = voteValue + ? WHERE imgId = ?");
    mysqli_stmt_bind_param($update, "is", $rating, $imgId);
    mysqli_stmt_execute($update);
    if (mysqli_stmt_affected_rows($update) == 0) {
        // creates a new row if imgId has no own row yet
        $insert = mysqli_prepare($dbh, "INSERT INTO vote (voteNr, voteValue, imgId) VALUES (1, ?, ?)");
        mysqli_stmt_bind_param($insert, "is", $rating, $imgId);
        mysqli_stmt_execute($insert);
        // Assume OK, return some text / current rating?
    } else {
        // OK, return some text / current rating?
    }
}

4) Mysql part: (to create a table).

CREATE TABLE `vote` (
        `voteNr` int(8) NOT NULL default '0',
        `voteValue` int(8) NOT NULL default '0',
        `imgId` varchar(100) NOT NULL default '',
        UNIQUE KEY `imgId` (`imgId`)
) ENGINE=MyISAM;

5) CSS part
Stylesheet part (creates onmouseover stars):

Stylesheet to create stars + mouseover. (use save as..)

HTML/PHP PART
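An example php code to retrieve the current rating from the database: $rating = getCurrenRating('12');

function getCurrenRating($imgId){
    global $dbh; // a mysqli connection to the database that holds the vote table

    $stmt = mysqli_prepare($dbh, "SELECT voteValue, voteNr FROM vote WHERE imgId = ? LIMIT 1");
    mysqli_stmt_bind_param($stmt, "s", $imgId);
    mysqli_stmt_execute($stmt);
    mysqli_stmt_bind_result($stmt, $voteValue, $voteNr);

    // no row or no votes yet: return 0 instead of dividing by zero
    if (!mysqli_stmt_fetch($stmt) || $voteNr == 0) return 0;
    return round($voteValue / $voteNr, 1);
}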

$rating is the rounded rating taken from database.
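
To make the arithmetic concrete, here is a small sketch of how the stored totals turn into a rating and into the pixel width of the highlighted stars. The function name `starRating` is made up, and the 25 pixels per star is an assumption that should match your star image.

```php
<?php
// Converts the stored vote totals into an average rating and a CSS width in pixels.
function starRating(int $voteValue, int $voteNr, int $starWidth = 25): array
{
    // Average of all votes, rounded to one decimal; 0.0 when nobody has voted yet.
    $rating = $voteNr > 0 ? round($voteValue / $voteNr, 1) : 0.0;

    return [
        'rating'  => $rating,
        'widthPx' => (int) round($rating * $starWidth),
    ];
}
```

So two votes of 4 stars each (voteValue 8, voteNr 2) give a rating of 4.0, which lights up the first 100 pixels of the star bar.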

$imgId is the unique id for this (image) item. This is used in the javascript, passed to the update.php file to update the database.

And don’t forget to include the rating.js file.

<div id="rating">
<h3>Rating:</h3>
<pre>
<ul class='star-rating'>
<li class='current-rating' id='current-rating' style='width: <?php $ratingpx = $rating * 25; echo $ratingpx; ?>px'><!-- Currently <?php echo $rating ?>/5 Stars. --></li>
<li><a href="javascript:rateImg(1,'<?php echo $imgId ?>')" title='1 star out of 5' class='one-star'>1</a></li>
<li><a href="javascript:rateImg(2,'<?php echo $imgId ?>')" title='2 stars out of 5' class='two-stars'>2</a></li>
<li><a href="javascript:rateImg(3,'<?php echo $imgId ?>')" title='3 stars out of 5' class='three-stars'>3</a></li>
<li><a href="javascript:rateImg(4,'<?php echo $imgId ?>')" title='4 stars out of 5' class='four-stars'>4</a></li>
<li><a href="javascript:rateImg(5,'<?php echo $imgId ?>')" title='5 stars out of 5' class='five-stars'>5</a></li>
</ul>
</pre>
</div>

Completely confused? See example here (scroll down) http://www.guidetobuy.info/product6-beamers.html.

NEW: Simple inline text edit!



Mercedes cls

Would be a nice goal to set for 2006: a Mercedes CLS (55 AMG), together with no more smoking.
image
Wow: fast, classy, expensive, comfortable, and very beautiful.
