Whatpm::ContentType - HTML5 Content Type Sniffer


  ## Content-Type Sniffing
  require Whatpm::ContentType;
  my $sniffed_type = Whatpm::ContentType->get_sniffed_type (
    get_file_head => sub {
      my $n = shift;
      return $first_n_bytes_of_the_entity;
    },
    http_content_type_byte => $content_type_field_body_of_the_entity_in_bytes,
    supported_image_types => {
      'image/jpeg' => 1, 'image/png' => 1, 'image/gif' => 1, # for example
    },
  );


The Whatpm::ContentType module contains a media type sniffer for Web user agents. It implements the content type sniffing algorithm defined in the HTML5 specification.


$sniffed_type = Whatpm::ContentType->get_sniffed_type (named-parameters)

Returns the sniffed type of an entity. The sniffed type is always represented in lowercase.

In list context, this method returns a two-element list: the official type followed by the sniffed type. The official type is the media type specified in the transfer protocol metadata, in lowercase and without any parameters.
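The two calling contexts can be contrasted in a short sketch. It assumes that Whatpm::ContentType is installed; the in-memory $entity and the header value are made up for illustration, and no particular result is implied.

```perl
use strict;
use warnings;

# Hypothetical in-memory entity and header value, for illustration only.
my $entity = "<!DOCTYPE html><title>Example</title>";
my %args = (
  get_file_head => sub {
    my $n = shift;
    return substr ($entity, 0, $n);   # at most the first $n bytes
  },
  http_content_type_byte => 'text/html; charset=utf-8',
);

# The module may not be installed, so load it defensively.
if (eval { require Whatpm::ContentType; 1 }) {
  # Scalar context: the sniffed type only.
  my $sniffed = Whatpm::ContentType->get_sniffed_type (%args);

  # List context: the official type first, then the sniffed type.
  my ($official, $sniffed_again)
      = Whatpm::ContentType->get_sniffed_type (%args);

  printf "official: %s\n", defined $official ? $official : '(none)';
  printf "sniffed:  %s\n", defined $sniffed ? $sniffed : '(none)';
} else {
  print "Whatpm::ContentType is not available\n";
}
```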

Arguments to this method MUST be specified as name-value pairs. The following named parameters are defined for this method:

content_type_metadata => media-type

The Content-Type metadata, as a character string, as defined in HTML5. The value of this parameter MUST be an Internet Media Type, together with any parameters, that matches the media-type rule defined in RFC 2616.

If the http_content_type_byte parameter is specified, then the content_type_metadata parameter has no effect. Otherwise, the content_type_metadata parameter MUST be specified if and only if any Content-Type metadata is available.

get_file_head => CODE

The code reference used to obtain the first $n bytes of the entity being sniffed. The value of this parameter MUST be a reference to a subroutine that returns a string.

This parameter MUST be specified. If missing, an empty (zero-length) entity is assumed.

When invoked, the code receives a parameter $n that represents the number of bytes expected. The code SHOULD return the first $n bytes of the entity. If more than $n bytes are returned, the bytes after the first $n are discarded. The code MAY return a string shorter than $n bytes if no more bytes are available.
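One way to satisfy this contract for an entity stored in a local file is sketched below. The helper name make_get_file_head is hypothetical and not part of the module, and treating an unreadable file as an empty entity is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: build a get_file_head callback for a file on disk.
sub make_get_file_head {
  my $path = shift;
  return sub {
    my $n = shift;
    # Assumption of this sketch: an unreadable file counts as empty.
    open my $fh, '<', $path or return '';
    binmode $fh;                          # read bytes, not characters
    my $buffer = '';
    # Loop because read() may return fewer bytes than requested.
    while (length ($buffer) < $n) {
      my $got = read ($fh, $buffer, $n - length ($buffer), length ($buffer));
      last unless $got;                   # end of file (0) or error (undef)
    }
    close $fh;
    return $buffer;                       # may be shorter than $n
  };
}
```

The returned code reference can then be passed as the value of get_file_head. Opening the file on each invocation keeps the callback stateless, so it always answers from the start of the entity.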

has_http_content_encoding => boolean

This parameter is obsolete and has no effect.

http_content_type_byte => Content-Type-field-body

The byte sequence of the field-body part of the HTTP Content-Type header field of the entity.

This parameter MUST be set to the byte sequence of the field-body of the entity's Content-Type header field if and only if the entity is transferred over HTTP and the HTTP response contains the Content-Type header field.
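The "if and only if" rules for content_type_metadata and http_content_type_byte amount to passing at most one of the two parameters. A minimal sketch of that selection follows; the helper name content_type_args and its two inputs are hypothetical, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: choose which Content-Type parameter to pass.
#   $http_field_body - bytes of the HTTP Content-Type field-body, or undef
#   $metadata        - Content-Type metadata from another source, or undef
sub content_type_args {
  my ($http_field_body, $metadata) = @_;
  # The HTTP field-body takes precedence when it is present.
  return (http_content_type_byte => $http_field_body)
      if defined $http_field_body;
  return (content_type_metadata => $metadata) if defined $metadata;
  return ();   # no Content-Type metadata available: pass neither
}

my %args = (
  get_file_head => sub { return '' },   # placeholder; supply real bytes here
  content_type_args ('text/html; charset=utf-8', undef),
);
```

The resulting %args can be passed as the name-value pairs of get_sniffed_type.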

supported_image_types => {media-type => boolean, ...}

A reference to a hash that lists the supported image types.

This parameter MUST be set to a reference to a hash whose keys are Internet Media Types (without any parameters) and whose values indicate whether the image format with that Internet Media Type is supported. A value MUST be true if and only if the Internet Media Type is supported.

If this parameter is missing, then no image type is considered supported.
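Such a hash can be built from a list of type names, as in the sketch below. The list of formats is hypothetical. Lowercasing the keys is a precaution that assumes lookups use lowercase names, which this documentation does not state.

```perl
use strict;
use warnings;

# Hypothetical list of formats that this user agent can decode.
my @decodable = qw(image/JPEG image/png image/gif);

# Keys are media types without parameters; every listed type maps to true.
# Lowercasing is an assumption of this sketch, not a documented requirement.
my %supported_image_types = map { (lc ($_) => 1) } @decodable;

# An explicitly false value marks a type as unsupported.
$supported_image_types{'image/bmp'} = 0;
```

A reference to the hash, \%supported_image_types, is then the value to pass as supported_image_types.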


HTML5 - Determining the type of a new resource in a browsing context


Wakaba.


Copyright 2007-2008 Wakaba

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.