Ruby: Net::HTTP and open-uri
August 13, 2006
Ruby ships with several libraries that provide higher-level access to network protocols such as FTP, HTTP and HTTPS. This article shows how to use net/http, net/https, open-uri and the rio library.
open-uri
open-uri is part of the Ruby standard library. It extends the Kernel#open method so that it also accepts URLs, and it is a wrapper for the net/http, net/https and net/ftp libraries.
open.rb:
require 'open-uri'
require 'pp'
open('http://www.juretta.com/') do |f|
  # hash with meta information (the response headers)
  pp f.meta
  # individual header values are also available through accessor methods
  pp "Content-Type: " + f.content_type
  pp "last modified" + f.last_modified.to_s
  no = 1
  # print the first four lines
  f.each do |line|
    print "#{no}: #{line}"
    no += 1
    break if no > 4
  end
end
Running this code results in:
powerbook:~ sts$ ruby open.rb
{"last-modified"=>"Sun, 13 Aug 2006 17:46:36 GMT",
"x-cache"=>"MISS from www.juretta.com",
"date"=>"Mon, 14 Aug 2006 05:32:54 GMT",
"etag"=>"1864126947",
"content-type"=>"text/html",
"server"=>"lighttpd/1.3.13",
"content-length"=>"33242",
"accept-ranges"=>"bytes"}
"Content-Type: text/html"
"last modifiedSun Aug 13 19:46:36 CEST 2006"
1: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2: "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
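On newer Ruby versions the same idea is written with URI.open, which open-uri has provided since Ruby 2.5; Kernel#open stopped accepting URLs in Ruby 3.0. The following is a minimal sketch rather than the original open.rb: the helper names numbered_lines and page_summary are hypothetical, and the line-numbering helper accepts any IO-like object so it can be tried without a network connection.

```ruby
require 'open-uri'
require 'stringio'

# Hypothetical helper (not part of open-uri): returns the first `count`
# lines of any object that responds to each_line, each prefixed with
# its line number.
def numbered_lines(io, count)
  io.each_line.first(count).each_with_index.map do |line, index|
    "#{index + 1}: #{line}"
  end
end

# Hypothetical helper: fetches `url` with URI.open and returns the
# content type together with the first `count` numbered lines.
# It is only defined here, not called, because it needs network access.
def page_summary(url, count = 3)
  URI.open(url) do |f|
    { content_type: f.content_type, lines: numbered_lines(f, count) }
  end
end

# A StringIO stands in for the HTTP response body, so this runs offline.
sample = StringIO.new("<html>\n<head>\n<body>\n</body>\n")
puts numbered_lines(sample, 2)
```

Calling page_summary('http://www.juretta.com/') would perform the actual request and return a hash with the content type and the numbered lines.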
HTTPS with basic authentication using net/https
The following example uses the net/https library to access the del.icio.us API, which requires SSL and HTTP Basic Authentication.
require 'net/https'
require 'rexml/document'

host     = "api.del.icio.us"
username = "" # your del.icio.us username
password = "" # your del.icio.us password
resp = ""
begin
  http = Net::HTTP.new(host, 443)
  http.use_ssl = true
  http.start do |conn|
    req = Net::HTTP::Get.new("/v1/tags/get",
                             {"User-Agent" => "juretta.com RubyLicious 0.2"})
    req.basic_auth(username, password)
    response = conn.request(req)
    resp = response.body
  end
  # XML Document
  doc = REXML::Document.new(resp)
  # iterate over each element <tag count="200" tag="Rails"/>
  doc.root.elements.each do |elem|
    print elem.attributes['tag'] + " -> " +
          elem.attributes['count'] + "\n"
  end
rescue SocketError
  # the host name could not be resolved or the host is down
  raise "Host " + host + " is not reachable"
rescue REXML::ParseException => e
  print "error parsing XML " + e.to_s
end
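The request-building step of this example can be tried in isolation, without contacting the server. The sketch below uses only Net::HTTP::Get and basic_auth from the standard library; the helper name tags_request and the credentials "alice" and "secret" are placeholders chosen for illustration and are not part of the original script.

```ruby
require 'net/http'

# Hypothetical helper: builds the GET request for /v1/tags/get with a
# custom User-Agent header and HTTP Basic Authentication credentials.
def tags_request(username, password)
  req = Net::HTTP::Get.new("/v1/tags/get",
                           { "User-Agent" => "juretta.com RubyLicious 0.2" })
  req.basic_auth(username, password)
  req
end

req = tags_request("alice", "secret") # placeholder credentials
# basic_auth stores "Basic " followed by the Base64-encoded
# "username:password" pair in the Authorization header.
puts req["Authorization"]
puts req["User-Agent"]
```

Passing the returned request to the connection's request method inside the start block sends it over SSL, as in the script above. Because Basic Authentication only encodes the credentials and does not encrypt them, it should be combined with SSL as shown.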
Net::HTTP with Hpricot
The following example fetches a page with Net::HTTP and uses Hpricot to parse the HTML and return selected elements (although using open-uri to fetch the page is the recommended approach).
Hpricot is a lenient HTML parser for Ruby, written in C.
require 'net/http'
require 'uri'
require 'rubygems'
# use 'gem install hpricot --source code.whytheluckystiff.net'
# to install hpricot
require 'hpricot'
require 'pp'
# Use Net::HTTP to fetch some html
html = Net::HTTP.get(URI.parse('http://www.juretta.com/log/'))
# use hpricot
doc = Hpricot(html)
# get all entries
doc.search("//div[@class='entry']/h3/a").each do |a|
  print a.inner_html + "\n -> " + a.attributes['href'] + "\n\n"
end
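Net::HTTP.get returns only the body, so a failed request or a redirect goes unnoticed. Net::HTTP.get_response returns the response object instead, and its class indicates the outcome. A minimal sketch of such a check follows; the helper name describe_response is hypothetical, and the responses are constructed by hand so the example runs without network access.

```ruby
require 'net/http'

# Hypothetical helper: summarizes a Net::HTTPResponse based on its class.
def describe_response(response)
  case response
  when Net::HTTPSuccess
    "ok (#{response.code})"
  when Net::HTTPRedirection
    "redirect to #{response['location']}"
  else
    "failed (#{response.code} #{response.message})"
  end
end

# Hand-built responses stand in for the result of
# Net::HTTP.get_response(URI.parse('http://www.juretta.com/log/')).
ok = Net::HTTPOK.new("1.1", "200", "OK")
moved = Net::HTTPFound.new("1.1", "302", "Found")
moved["location"] = "http://www.juretta.com/log/"

puts describe_response(ok)
puts describe_response(moved)
```

With a real response, the body would be passed to Hpricot only in the success branch.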
rio
rio is another convenience library that wraps several I/O classes behind a single interface. It uses open-uri to access network streams and makes it easy to handle many different kinds of input and output streams.
# (sudo) gem install rio
require 'rubygems'
require 'rio'
# open a URI and copy its content into a file
rio('http://www.juretta.com/') > rio('juretta_index.html')