juretta.com

Ruby: Net::HTTP and open-uri

August 13, 2006

Ruby ships with several libraries that provide higher-level access to network protocols such as FTP, HTTP and HTTPS. This article shows how to use net/http, net/https, open-uri and the rio library.

open-uri

open-uri is part of the Ruby standard library. It extends the Kernel#open method and acts as a wrapper around the net/http, net/https and net/ftp libraries.

open.rb:
require 'open-uri'
require 'pp'

open('http://www.juretta.com/') do |f|
  # hash with meta information (the response headers)
  pp  f.meta

  # convenience accessors for individual headers
  pp "Content-Type: " + f.content_type
  pp "last modified" + f.last_modified.to_s

  no = 1
  # print the first four lines
  f.each do |line|
    print "#{no}: #{line}"
    no += 1
    break if no > 4
  end
end
powerbook:~ sts$ ruby open.rb 
{"last-modified"=>"Sun, 13 Aug 2006 17:46:36 GMT",
 "x-cache"=>"MISS from www.juretta.com",
 "date"=>"Mon, 14 Aug 2006 05:32:54 GMT",
 "etag"=>"1864126947",
 "content-type"=>"text/html",
 "server"=>"lighttpd/1.3.13",
 "content-length"=>"33242",
 "accept-ranges"=>"bytes"}
"Content-Type: text/html"
"last modifiedSun Aug 13 19:46:36 CEST 2006"
1: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2:         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3: <!--
4:     : application.rhtml 265 2006-08-10 11:20:30Z stefan
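
The manual line counter in open.rb could also be written with an enumerator. The following is only a sketch: numbered_head is a hypothetical helper name, not part of open-uri, and a StringIO stands in for the handle that open yields so that the snippet runs without network access.

```ruby
require 'stringio'

# Hypothetical helper (not part of open-uri): returns the first `limit`
# lines of any IO-like object, each prefixed with its 1-based line number.
def numbered_head(io, limit)
  io.each_line.first(limit).each_with_index.map do |line, index|
    "#{index + 1}: #{line}"
  end
end

# StringIO is an assumption made for this sketch; it replaces the
# network handle so nothing is fetched.
sample = StringIO.new("first\nsecond\nthird\nfourth\nfifth\n")
print numbered_head(sample, 4).join
```

Inside the open block of open.rb, the same helper would presumably be called as numbered_head(f, 4). Note that on Ruby 3.0 and later open-uri no longer extends the bare open call for URLs, so the handle would come from URI.open instead.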

HTTPS with basic authentication using net/https

The following example uses the net/https library to access the del.icio.us API, which requires SSL and HTTP Basic Authentication.

require 'net/https'
require 'rexml/document'

host     = "api.del.icio.us"
username = "" # your del.icio.us username
password = "" # your del.icio.us password

resp = ""
begin
  http = Net::HTTP.new(host, 443)
  http.use_ssl = true
  http.start do |conn|
    # GET /v1/tags/get with a custom User-Agent header
    req = Net::HTTP::Get.new("/v1/tags/get",
                             "User-Agent" => "juretta.com RubyLicious 0.2")
    req.basic_auth(username, password)
    resp = conn.request(req).body
  end

  # parse the response body as an XML document
  doc = REXML::Document.new(resp)
  # iterate over each element <tag count="200" tag="Rails"/>
  doc.root.elements.each do |elem|
    puts elem.attributes['tag'] + " -> " + elem.attributes['count']
  end
rescue SocketError
  raise "Host " + host + " not reachable"
rescue REXML::ParseException => e
  puts "error parsing XML " + e.to_s
end
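
The request-building and XML-parsing steps above can be tried without del.icio.us credentials. The sketch below makes several assumptions: the credentials are placeholders, the sample XML is made up in the shape that the comment in the example describes, and tag_counts is a hypothetical helper name. No request is actually sent.

```ruby
require 'net/http'
require 'rexml/document'

# Build the request object only; nothing goes over the network.
req = Net::HTTP::Get.new("/v1/tags/get")
req.basic_auth("user", "secret") # placeholder credentials

# basic_auth stores the credentials in the Authorization header as
# "Basic " followed by the Base64 encoding of "user:secret".
auth_header = req["Authorization"]
puts auth_header

# Made-up response body for illustration, not real API output.
sample_xml = <<~XML
  <tags>
    <tag count="200" tag="Rails"/>
    <tag count="32" tag="Ruby"/>
  </tags>
XML

# Hypothetical helper: returns [tag, count] pairs from a tag document.
def tag_counts(xml)
  pairs = []
  REXML::Document.new(xml).root.elements.each do |elem|
    pairs << [elem.attributes['tag'], elem.attributes['count'].to_i]
  end
  pairs
end

tag_counts(sample_xml).each { |tag, count| puts "#{tag} -> #{count}" }
```

Because Basic Authentication only encodes the credentials and does not encrypt them, it makes sense that the API is reached over SSL.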

Net::HTTP with Hpricot

The following example fetches a page with Net::HTTP and uses Hpricot to parse the HTML and return selected elements (although open-uri is the recommended way to fetch the page).

Hpricot is a nice, loose HTML parser for Ruby, written in C.
require 'net/http'
require 'uri'
require 'rubygems'
# use 'gem install hpricot --source code.whytheluckystiff.net' 
# to install hpricot
require 'hpricot'

require 'pp'

# Use Net::HTTP to fetch some html
html = Net::HTTP.get(URI.parse('http://www.juretta.com/log/'))

# use hpricot
doc = Hpricot(html)

# get all entries
doc.search("//div[@class='entry']/h3/a").each do |a|
  print a.inner_html + "\n  -> " + a.attributes['href'] + "\n\n"
end
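
Net::HTTP.get in the example above receives a URI object rather than a plain string. As a small sketch of what URI.parse extracts from the URL, the following prints the individual components; it performs no request.

```ruby
require 'uri'

uri = URI.parse('http://www.juretta.com/log/')

puts uri.class   # class chosen by URI.parse based on the scheme
puts uri.host    # host that Net::HTTP would connect to
puts uri.port    # taken from the scheme's default when the URL names no port
puts uri.path    # path that would be requested

# Not executed here: Net::HTTP.get_response(uri) would return a response
# object instead of just the body, which allows checking the status code.
```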

rio

rio is yet another convenience library that wraps Ruby's I/O classes. It uses open-uri to access network streams and makes it easy to work with many kinds of input and output streams.

# (sudo) gem install rio
require 'rubygems'
require 'rio'
# open a URI and copy the content into a file
rio('http://www.juretta.com/') > rio('juretta_index.html')
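
For readers without rio installed, a roughly comparable stream-to-file copy can be sketched with the standard library. This is not how rio is implemented; it swaps in IO.copy_stream, copy_to_file is a hypothetical helper name, and a StringIO stands in for the network stream so that the snippet runs offline.

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical helper: copies everything readable from `source` into
# the file at `path` and returns the number of bytes copied.
def copy_to_file(source, path)
  IO.copy_stream(source, path)
end

Dir.mktmpdir do |dir|
  target = File.join(dir, 'juretta_index.html')
  # StringIO is an assumption of this sketch; with open-uri the source
  # would be the handle yielded for the URL.
  bytes = copy_to_file(StringIO.new("<html></html>\n"), target)
  puts "#{bytes} bytes written to #{File.basename(target)}"
end
```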

For simple downloads outside of Ruby, the command-line tools curl and wget are also worth a look.

About

juretta.com is the personal workspace of Stefan Saasen. You can send him an email or read more about this site in the „About“ section.