Friday, April 11, 2014

Web Development - HTTP

What is HTTP and why do we need to understand it?

HTTP stands for Hypertext Transfer Protocol. HTTP is the protocol that lets us search for a book on Google and buy one from flipkart.com. It's also the protocol that lets us reunite with old friends in a Facebook chat or watch a music video on YouTube. It's a protocol that allows a web server in a data center in the United States to ship information to an Internet cafe in India, where a student can read a web page describing the Great Wall of China.
Having a solid understanding of HTTP can help you write better web applications and web services. It can also help you debug applications and services when things go wrong.


Uniform Resource Locator

A URL represents an HTTP address. Whenever we need to buy a book to study, we open a web browser, enter http://amazon.com in the address bar, and search there for the book we want.
Here our browser makes an HTTP request to the server named amazon.com and then gets a response from that server, depending on what is available.
http://www.amazon.com is called a URL (Uniform Resource Locator), and it identifies a specific resource on the web. A resource may be an image, a page, a file, a video, or anything else we want to interact with. Two different resources on the web have two different URLs.

When we write a URL, we can break it into three parts.
First, the http part before the colon and double slash is called the URL scheme, and it describes how to access a particular resource. There are other URL schemes as well, such as https, which describes the secure version of the protocol.

Everything after the colon and double slash is specific to the particular scheme.

The first part after the colon and double slash is amazon.com, which is called the host. This part tells the browser which computer on the Internet is hosting the resource. The computer uses the Domain Name System (DNS) to look up amazon.com and turn it into a network address, so that it knows exactly where to send the request.
The host portion can also be specified using an IP address, like 204.78.50.82.

The last part of the URL is the URL path, which tells the server which specific resource is being requested. The path can point to a real file on the server. Some paths instead represent dynamic content that is built from data in a database.


These days websites try to avoid having real file names in their URLs. Many websites place keywords in the URL instead, which helps with search engine optimization and can help the resource rank higher in search engine results.

In addition to the URL scheme, host name, and URL path, a URL can also contain the port number that the host uses to listen for HTTP requests. The default port for HTTP is port 80, so it is usually omitted.
Consider another URL: http://google.com/search?q=java
Here everything after the ? mark is called the query, or query string. The query string contains information for the website; it is interpreted by the application, and that interpretation determines which resource you get. The query string usually appears as name-value pairs, such as q=java.
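
To make the name-value idea concrete, here is a minimal sketch in Python that pulls the query string out of the URL above and splits it into pairs. It uses the standard library's urllib.parse module; the helper name query_params is made up for this illustration.

```python
from urllib.parse import urlsplit, parse_qs

def query_params(url):
    """Return the query string of a URL as a dict of name -> list of values."""
    # query_params is a hypothetical helper name, not part of any library.
    query = urlsplit(url).query
    return parse_qs(query)

params = query_params("http://google.com/search?q=java")
print(params)  # {'q': ['java']}
```

Each name maps to a list because the same name may appear more than once in a query string.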

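Consider another URL: http://wikipedia.org/wiki/java#description
Here everything after the # symbol is called the fragment. When you type this URL into the browser's address bar, the browser scrolls so that the part of the page identified by the fragment appears at the top of the screen. Unlike the rest of the URL, the fragment is handled by the browser and is not sent to the server.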

So a general URL can be represented as:
http://host:80/path?q=query#fragment
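
As a rough illustration of how these parts fit together, the following Python sketch splits the general URL into its pieces with the standard library's urllib.parse.urlsplit. The host name "host" is just the placeholder from the example, not a real server.

```python
from urllib.parse import urlsplit

# Split the example URL into scheme, host, port, path, query and fragment.
parts = urlsplit("http://host:80/path?q=query#fragment")

print(parts.scheme)    # http
print(parts.hostname)  # host
print(parts.port)      # 80
print(parts.path)      # /path
print(parts.query)     # q=query
print(parts.fragment)  # fragment
```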

URL Encoding
URL encoding deals with character and encoding issues in URLs: which characters are safe to use in a URL and which are unsafe.
The space, the pound sign (#), and the caret (^) are examples of unsafe characters. RFC 3986 provides the rules for URLs. Uppercase letters, lowercase letters, digits, and a few symbols such as the asterisk, parentheses, dollar sign, and underscore can appear in a URL without encoding, although some of these symbols have a reserved meaning in certain parts of a URL.

We can still include unsafe characters in a URL by using URL encoding (also called percent-encoding).
Example: http://abc.com/rishi%20ranjan
Here %20 encodes the space character: 20 is the hexadecimal value of the space character in the US-ASCII character set. Almost every web framework provides APIs for URL encoding.
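
As one example of such an API, Python's standard library offers quote and unquote in urllib.parse. The short sketch below encodes the path segment from the example above and a string containing the pound sign and caret; the sample strings are only for illustration.

```python
from urllib.parse import quote, unquote

# Encode a path segment that contains a space.
encoded = quote("rishi ranjan")
print(encoded)           # rishi%20ranjan

# Decoding gives back the original text.
print(unquote(encoded))  # rishi ranjan

# The pound sign and caret are encoded as well.
print(quote("a#b^c"))    # a%23b%5Ec
```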



Content type:
The web is full of resources such as images, videos, and different types of documents. When we make a request for a specific type of resource, the host should represent that resource in the way the client wants it.

So when a host responds to an HTTP request, it returns the resource and also specifies its content type.

The content types a server uses rely on the Multipurpose Internet Mail Extensions (MIME) standards. When a client requests an HTML web page, the host labels the response as text/html, where text is the primary media type and html is the subtype. When the host responds to an image request, it labels the resource with a content type of image/jpeg, image/gif, or image/png. This label appears as text in the HTTP response, in the Content-Type header, where the client can parse it. To serve each resource with the right content type, we need to configure the MIME types in the web server.
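
The idea of a server mapping resources to content types can be sketched in a few lines of Python. The table and the two helper functions below are hypothetical and heavily simplified; a real web server is configured with a much longer list of MIME types.

```python
import posixpath

# Hypothetical extension-to-MIME-type table, similar in spirit to what a
# web server is configured with. Real servers ship a much longer list.
MIME_TYPES = {
    ".html": "text/html",
    ".jpg": "image/jpeg",
    ".gif": "image/gif",
    ".png": "image/png",
}

def content_type_for(path, default="application/octet-stream"):
    """Pick a Content-Type value for a URL path based on its file extension."""
    _, ext = posixpath.splitext(path)
    return MIME_TYPES.get(ext.lower(), default)

def split_media_type(content_type):
    """Split a value like 'text/html' into its primary type and subtype."""
    primary, _, subtype = content_type.partition("/")
    return primary, subtype

print(content_type_for("/index.html"))  # text/html
print(split_media_type("image/jpeg"))   # ('image', 'jpeg')
```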

Content Negotiation:
Suppose we want to visit a site to find a recipe. A recipe can be represented in different languages, such as English or German, and its presentation format can also differ: HTML, PDF, or plain text.
In this case, when a client makes an HTTP request to a server, the client can specify the media types it is willing to accept back from the host. The host then returns the best representation it has available, which is why this is called content negotiation.

In terms of language, you can see content negotiation at work when you change the language preference on google.com.

Content negotiation plays a vital role when developers deal with web services. A piece of code written in JavaScript can make a request to a server and ask for a JSON representation. Meanwhile, a piece of code written in C++ can make a request to the same server at the same URL and ask for an XML representation of the resource. In both cases the host will satisfy the request, and the information will arrive at the client in the ideal format for parsing.
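
Here is a simplified sketch, in Python, of how a host might choose a representation from the media types a client lists in its Accept header. The function names are made up for this example, and the logic is deliberately minimal: it understands q-values and the */* wildcard, but ignores partial wildcards such as text/* and the finer precedence rules of the HTTP specification.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    choices = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        choices.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(choices, key=lambda c: c[1], reverse=True)

def negotiate(accept_header, available):
    """Return the first available media type the client accepts, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
    return None

available = ["application/json", "application/xml"]
print(negotiate("application/json", available))
# application/json
print(negotiate("text/html;q=0.5, application/xml;q=0.9", available))
# application/xml
```

The same URL can therefore yield JSON for one client and XML for another, depending only on what each client says it accepts.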

