Ruby: Net::Http and open-uri | August 13, 2006
Ruby has several libraries that provide higher-level access to network protocols such as FTP, HTTP and HTTPS. This article shows how to use net/http, net/https, open-uri and the rio library.
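open-uri
open-uri is part of the Ruby standard library. It extends open so that http, https and ftp URLs can be read like files (Kernel#open in older Ruby versions, URI.open since Ruby 2.5), and it is a wrapper for the net/http, net/https and net/ftp libraries.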
open.rb:
    require 'open-uri'
    require 'pp'

    # URI.open is available since Ruby 2.5; older versions call open(...) instead
    URI.open('http://www.juretta.com/') do |f|
      # hash with meta information
      pp f.meta

      pp "Content-Type: " + f.content_type
      pp "last modified" + f.last_modified.to_s

      no = 1
      # print the first four lines
      f.each do |line|
        print "#{no}: #{line}"
        no += 1
        break if no > 4
      end
    end
powerbook:~ sts$ ruby open.rb
{"last-modified"=>"Sun, 13 Aug 2006 17:46:36 GMT",
"x-cache"=>"MISS from www.juretta.com",
"date"=>"Mon, 14 Aug 2006 05:32:54 GMT",
"etag"=>"1864126947",
"content-type"=>"text/html",
"server"=>"lighttpd/1.3.13",
"content-length"=>"33242",
"accept-ranges"=>"bytes"}
"Content-Type: text/html"
"last modifiedSun Aug 13 19:46:36 CEST 2006"
1: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2: "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3: <!--
4: : application.rhtml 265 2006-08-10 11:20:30Z stefan
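To try the same open-uri calls without network access, the following is a minimal sketch that reads from a throwaway local server instead of a real site. The one-shot TCPServer, its response body and the variable names are assumptions made for illustration and are not part of the original example. `proxy: false` is passed so that a proxy configured in the environment is not used for the local request.

```ruby
require 'open-uri'
require 'socket'

# Hypothetical stand-in for a real web server: answers exactly one request
# on a free local port, so the example runs without network access.
server = TCPServer.new('127.0.0.1', 0)
port   = server.addr[1]
body   = "line one\nline two\nline three\n"

server_thread = Thread.new do
  client = server.accept
  # Skip the request line and headers (they end with an empty line).
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  headers = [
    "HTTP/1.1 200 OK",
    "Content-Type: text/plain",
    "Content-Length: #{body.bytesize}",
    "Connection: close"
  ]
  client.write(headers.join("\r\n") + "\r\n\r\n" + body)
  client.close
end

meta = nil
content_type = nil
first_lines = nil

# The block receives an IO-like object that also carries the response metadata.
URI.open("http://127.0.0.1:#{port}/", proxy: false) do |f|
  meta         = f.meta               # hash of response headers, keys downcased
  content_type = f.content_type       # media type without parameters
  first_lines  = f.each_line.first(2) # read only the first two lines
end

server_thread.join
server.close

puts content_type
first_lines.each_with_index { |line, i| print "#{i + 1}: #{line}" }
```

The header keys in `f.meta` are lowercase strings, which is why the output above shows "content-type" rather than "Content-Type".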
HTTPS with basic authentication using net/https
The following example uses the net/https library to access the del.icio.us API, which requires SSL and HTTP Basic Authentication.
    require 'net/https'
    require 'rexml/document'

    host     = "api.del.icio.us"
    username = "" # your del.icio.us username
    password = "" # your del.icio.us password
    resp = ""

    begin
      http = Net::HTTP.new(host, 443)
      http.use_ssl = true
      http.start do |conn|
        req = Net::HTTP::Get.new("/v1/tags/get", {"User-Agent" => "juretta.com RubyLicious 0.2"})
        req.basic_auth(username, password)
        response = conn.request(req)
        resp = response.body
      end

      # XML Document
      doc = REXML::Document.new(resp)
      # iterate over each element <tag count="200" tag="Rails"/>
      doc.root.elements.each do |elem|
        print elem.attributes['tag'] + " -> " + elem.attributes['count'] + "\n"
      end
    rescue SocketError
      # raised when the host cannot be resolved or reached
      raise "Host " + host + " is not reachable"
    rescue REXML::ParseException => e
      print "error parsing XML " + e.to_s
    end
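To see what basic_auth actually adds to the request, the sketch below builds a request without sending it and inspects the resulting Authorization header. The credentials, the User-Agent string and the host name api.example.com are placeholders chosen for illustration, not values from the del.icio.us API.

```ruby
require 'net/http'

# Placeholder credentials, hypothetical values for illustration only.
username = "alice"
password = "secret"

req = Net::HTTP::Get.new("/v1/tags/get", {"User-Agent" => "example-agent 0.1"})
req.basic_auth(username, password)

# The header has the form "Basic <base64 of username:password>".
scheme, encoded = req['Authorization'].split(' ', 2)
decoded = encoded.unpack('m').first # 'm' decodes base64

puts scheme   # prints "Basic"
puts decoded  # prints "alice:secret"

# No connection is opened here; this only configures the client for SSL.
http = Net::HTTP.new("api.example.com", 443)
http.use_ssl = true
puts http.use_ssl? # prints "true"
```

Because the credentials are only base64-encoded and not encrypted, Basic Authentication should be combined with SSL, as the del.icio.us example does.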
The following example uses Net::HTTP directly to fetch a page. Hpricot then parses the HTML and returns selected elements. For a simple fetch like this one, open-uri is the recommended choice.
Hpricot is a nice, loose HTML parser for Ruby, written in C.
    require 'net/http'
    require 'uri'
    require 'rubygems'
    # use 'gem install hpricot --source code.whytheluckystiff.net'
    # to install hpricot
    require 'hpricot'
    require 'pp'

    # Use Net::HTTP to fetch some html
    html = Net::HTTP.get(URI.parse('http://www.juretta.com/log/'))

    # use hpricot
    doc = Hpricot(html)

    # get all entries
    doc.search("//div[@class='entry']/h3/a").each do |a|
      print a.inner_html + "\n -> " + a.attributes['href'] + "\n\n"
    end
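Net::HTTP.get returns only the body and does not signal HTTP errors, so it can be useful to look at the response object before parsing. The following is a rough sketch of that check, run against a hypothetical one-shot local server so that it works offline. The server, the sample markup and the variable names are assumptions for illustration, and the Hpricot step is left out.

```ruby
require 'net/http'
require 'socket'

# Hypothetical one-shot local server standing in for the real site.
server = TCPServer.new('127.0.0.1', 0)
port   = server.addr[1]
page   = "<div class='entry'><h3><a href='/log/1'>First entry</a></h3></div>"

server_thread = Thread.new do
  client = server.accept
  # Skip the request line and headers (they end with an empty line).
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  headers = [
    "HTTP/1.1 200 OK",
    "Content-Type: text/html",
    "Content-Length: #{page.bytesize}",
    "Connection: close"
  ]
  client.write(headers.join("\r\n") + "\r\n\r\n" + page)
  client.close
end

# Passing nil as the third argument disables any proxy from the environment.
http     = Net::HTTP.new('127.0.0.1', port, nil)
response = http.get('/log/')

server_thread.join
server.close

# Only hand the body to a parser when the request actually succeeded.
html = case response
       when Net::HTTPSuccess then response.body
       else raise "unexpected response: #{response.code} #{response.message}"
       end

puts response.code            # the status code is a String
puts response['Content-Type'] # header lookup is case-insensitive
puts html
```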
rio is another convenience library that wraps existing I/O classes. It uses open-uri to access network streams and makes it easy to handle many kinds of input and output streams.
    # (sudo) gem install rio
    require 'rubygems'
    require 'rio'

    # open an URI and copy the content into a file
    rio('http://www.juretta.com/') > rio('juretta_index.html')
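For comparison, a similar copy can be sketched with the standard library alone. This is not how rio is implemented; it is only an illustration of the same stream-to-file idea using IO.copy_stream. The helper name save_stream is hypothetical, and a StringIO stands in for the network stream so the example runs offline.

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical helper: copies everything from an IO-like source into a file
# and returns the number of bytes written.
def save_stream(io, path)
  IO.copy_stream(io, path)
end

content = "<html><body>hello</body></html>\n"
source  = StringIO.new(content) # stands in for a stream opened with open-uri

copied = nil
saved  = nil

Dir.mktmpdir do |dir|
  target = File.join(dir, 'index.html')
  copied = save_stream(source, target)
  saved  = File.read(target)
end

puts "copied #{copied} bytes"
```

With open-uri loaded, the object yielded by `URI.open(url)` could be passed to `save_stream` in the same way as the StringIO above.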
