Structure
Even well-formed HTML pages are harder to process than they should be because of the lack of structure. You have to figure out where the section breaks go by analyzing header levels. Sidebars, footers, headers, navigation menus, main content sections, and individual stories are all marked up by the catch-all div element. HTML 5 adds new elements to specifically identify each of these common constructs:
- section: A part or chapter in a book, a section in a chapter, or essentially anything that has its own heading in HTML 4
- header: The page header shown on the page; not the same as the head element
- footer: The page footer where the fine print goes; the signature in an e-mail message
- nav: A collection of links to other pages
- article: An independent entry in a blog, magazine, compendium, and so forth
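To make the "analyzing header levels" step concrete, here is a minimal sketch of how a processing tool might infer section nesting from heading levels alone. The function name buildOutline and the { level, text } input shape are assumptions made for illustration; they are not part of any HTML API.

```javascript
// Hypothetical helper: infers a nested outline from a flat list of headings,
// which is the guesswork a tool needs when a page has no sectioning elements.
function buildOutline(headings) {
  // Sentinel root at level 0 so every real heading (levels 1-6) nests under it.
  const root = { level: 0, text: "(document)", children: [] };
  const stack = [root];

  for (const { level, text } of headings) {
    const node = { level, text, children: [] };
    // Close every open section that is at the same depth or deeper.
    while (stack[stack.length - 1].level >= level) {
      stack.pop();
    }
    stack[stack.length - 1].children.push(node);
    stack.push(node);
  }
  return root.children;
}

const outline = buildOutline([
  { level: 1, text: "Title" },
  { level: 2, text: "Intro" },
  { level: 3, text: "Details" },
  { level: 2, text: "Usage" },
]);
console.log(JSON.stringify(outline, null, 2));
```

With explicit section and article elements, this inference is unnecessary, because the nesting is stated directly in the markup.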
New Elements in HTML5
The following elements have been introduced for better structure:
- section represents a generic document or application section. It can be used together with the h1, h2, h3, h4, h5, and h6 elements to indicate the document structure.
- article represents an independent piece of content of a document, such as a blog entry or newspaper article.
- main can be used as a container for the dominant contents of another element, such as the main content of the page. In W3C HTML5 and W3C HTML 5.1, only one such element is allowed in a document.
- aside represents a piece of content that is only slightly related to the rest of the page.
- In WHATWG HTML, hgroup represents the header of a section.
- header represents a group of introductory or navigational aids.
- footer represents a footer for a section and can contain information about the author, copyright information, etc.
- nav represents a section of the document intended for navigation.
- figure represents a piece of self-contained flow content, typically referenced as a single unit from the main flow of the document.
      <figure>
        <video src="example.webm" controls></video>
        <figcaption>Example</figcaption>
      </figure>
  figcaption can be used as a caption (it is optional).
- video and audio are used for multimedia content. Both provide an API so application Web developers can script their own user interface, but there is also a way to trigger a user interface provided by the user agent. source elements are used together with these elements if there are multiple streams available of different types.
- track provides text tracks for the video element.
- embed is used for plugin content.
- mark represents a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context.
- progress represents the completion of a task, such as downloading or performing a series of expensive operations.
- meter represents a measurement, such as disk usage.
- time represents a date and/or time.
- WHATWG HTML and W3C HTML5.1 have data, which allows content to be annotated with a machine-readable value.
- dialog is used for showing a dialog.
- ruby, rt, and rp allow for marking up ruby annotations.
- bdi represents a span of text that is to be isolated from its surroundings for the purposes of bidirectional text formatting.
- wbr represents a line break opportunity.
- canvas is used for rendering dynamic bitmap graphics on the fly, such as graphs or games.
- menuitem represents a command the user can invoke from a popup menu.
- details represents additional information or controls which the user can obtain on demand. The summary element provides its summary, legend, or caption.
- datalist together with the new list attribute for input can be used to make comboboxes:
      <input list="browsers">
      <datalist id="browsers">
        <option value="Safari">
        <option value="Internet Explorer">
        <option value="Opera">
        <option value="Firefox">
      </datalist>
- keygen represents a control for key pair generation.
- output represents some type of output, such as from a calculation done through scripting.
The input element's type attribute now has a number of new values. The idea of these new types is that the user agent can provide the user interface, such as a calendar date picker or integration with the user's address book, and submit a defined format to the server. This gives the user a better experience, because the input is checked before it is sent to the server, meaning there is less time to wait for feedback.
New Attributes
Several attributes have been introduced to various elements that were already part of HTML4:
- The a and area elements have the new download attribute in WHATWG HTML and W3C HTML5.1. WHATWG HTML also has the ping attribute.
- The area element, for consistency with the a and link elements, now also has the hreflang, type and rel attributes.
- The base element can now have a target attribute as well, mainly for consistency with the a element. (This is already widely supported.)
- The meta element has a charset attribute now as this was already widely supported and provides a nice way to specify the character encoding for the document.
- In WHATWG HTML and W3C HTML5.1, the table element now has a sortable attribute and the th element has a sorted attribute, which provide a means to sort table columns.
- A new autofocus attribute can be specified on the input (except when the type attribute is hidden), select, textarea and button elements. It provides a declarative way to focus a form control during page load. Using this feature should enhance the user experience compared to focusing the element with script, because the user can, for instance, turn it off if they do not like it.
- A new placeholder attribute can be specified on the input and textarea elements. It represents a hint intended to aid the user with data entry.
      <input type=email placeholder="a@b.com">
- The new form attribute for input, output, select, textarea, button, label, object and fieldset elements allows for controls to be associated with a form. These elements can now be placed anywhere on a page, not just as descendants of the form element, and still be associated with a form.
      <table>
       <tr> <th>Key <th>Value <th>Action
       <tr>
        <td><form id=1><input name=1-key></form>
        <td><input form=1 name=1-value>
        <td><button form=1 name=1-action value=save>✓</button>
            <button form=1 name=1-action value=delete>✗</button>
       ...
      </table>
- The new required attribute applies to input (except when the type attribute is hidden, image or some button type such as submit), select and textarea. It indicates that the user has to fill in a value in order to submit the form. For select, the first option element has to be a placeholder with an empty value.
      <label>Color: <select name=color required>
       <option value="">Choose one
       <option>Red
       <option>Green
       <option>Blue
      </select></label>
- The fieldset element now allows the disabled attribute, which disables all descendant controls (excluding those that are descendants of the legend element) when specified, and the name attribute, which can be used for script access.
- The input element has several new attributes to specify constraints: autocomplete, min, max, multiple, pattern and step. As mentioned before, it also has a new list attribute which can be used together with the datalist element. It also now has the width and height attributes to specify the dimensions of the image when using type=image.
- The input and textarea elements have a new attribute named dirname that causes the directionality of the control as set by the user to be submitted as well.
- The textarea element also has two new attributes, maxlength and wrap, which control the maximum input length and the submitted line wrapping behavior, respectively.
- The form element has a novalidate attribute that can be used to disable form validation on submission (i.e. the form can always be submitted).
- The input and button elements have formaction, formenctype, formmethod, formnovalidate, and formtarget as new attributes. If present, they override the action, enctype, method, novalidate, and target attributes on the form element.
- In WHATWG HTML and W3C HTML5.1, the input and textarea elements have an inputmode attribute.
- The menu element has two new attributes: type and label. They allow the element to transform into a menu as found in typical user interfaces, as well as providing for context menus in conjunction with the global contextmenu attribute.
- In WHATWG HTML and W3C HTML5.1, the button element has a new menu attribute, used together with popup menus.
- The style element has a new scoped attribute which can be used to enable scoped style sheets. Style rules within such a style element only apply to the local tree.
- The script element has a new attribute called async that influences script loading and execution.
- The html element has a new attribute called manifest that points to an application cache manifest used in conjunction with the API for offline Web applications.
- The link element has a new attribute called sizes. It can be used in conjunction with the icon relationship (set through the rel attribute; can be used for e.g. favicons) to indicate the size of the referenced icon, thus allowing for icons of distinct dimensions.
- The ol element has a new attribute called reversed. When present, it indicates that the list order is descending.
- The iframe element has three new attributes called sandbox, seamless, and srcdoc which allow for sandboxing content, e.g. blog comments.
- The object element has a new attribute called typemustmatch which allows safer embedding of external resources.
- The img element has a new attribute called crossorigin to use CORS in the fetch and, if the fetch is successful, to allow the image data to be read with the canvas API. In WHATWG HTML and W3C HTML5.1, the script element has a crossorigin attribute to allow script errors to be reported to onerror with information about the error. WHATWG HTML and W3C HTML5.1 also have the crossorigin attribute on the link element.
- In WHATWG HTML, the img element has a new attribute called srcset to support multiple images for different resolutions and different images for different viewport sizes.
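The required and pattern constraints from the list above can be sketched in script. This is a rough illustration of the checks a user agent performs, not the browser's actual implementation: the function names matchesPattern and isValid are hypothetical, and it assumes the usual rule that a pattern must match the entire value and that empty values are left to the required check.

```javascript
// Hypothetical helper: does `value` satisfy a pattern attribute?
function matchesPattern(value, pattern) {
  let re;
  try {
    // The pattern is anchored so that it must match the whole value.
    // Assumption: a Unicode-aware flag is used; "u" is chosen here.
    re = new RegExp(`^(?:${pattern})$`, "u");
  } catch {
    // An invalid regular expression is ignored rather than failing the field.
    return true;
  }
  return re.test(value);
}

// Hypothetical helper combining the required and pattern checks.
function isValid(value, { required = false, pattern } = {}) {
  if (value === "") {
    // Empty values only fail when the control is required.
    return !required;
  }
  return pattern === undefined || matchesPattern(value, pattern);
}

console.log(isValid("", { required: true }));
console.log(isValid("123", { pattern: "[0-9]{3}" }));
console.log(isValid("1234", { pattern: "[0-9]{3}" }));
```

In a real page these checks are declarative: the attributes go on the control and the user agent reports the problem before the form is submitted.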
The following HTML4 attributes now apply to all elements as global attributes: accesskey, class, dir, id, lang, style, tabindex and title. Additionally, XHTML 1.0 only allowed xml:space on some elements; it is now allowed on all elements in XHTML documents.
There are also several new global attributes:
- The contenteditable attribute indicates that the element is an editable area. The user can change the contents of the element and manipulate the markup.
- The contextmenu attribute can be used to point to a context menu provided by the Web developer.
- The data-* collection of Web developer-defined attributes. Web developers can define any attribute they want as long as they prefix it with data- to avoid clashes with future versions of HTML. These are intended to be used to store custom data to be consumed by the Web page or application itself. They are not intended for data to be consumed by other parties (e.g. user agents).
- The draggable and dropzone attributes can be used together with the new drag & drop API.
- The hidden attribute indicates that an element is not yet, or is no longer, relevant.
- WHATWG HTML and W3C HTML5.1 have the inert attribute, intended to make dialog elements modal.
- The role and aria-* collection of attributes, which can be used to instruct assistive technology. These attributes have slightly different requirements in WHATWG HTML and W3C HTML5/W3C HTML5.1.
- The spellcheck attribute allows for hinting whether content can be checked for spelling or not.
- The translate attribute gives a hint to translators whether the content should be translated.
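Browsers expose data-* attributes to script through an element's dataset property, with the attribute names converted to camelCase. The following is a minimal sketch of that name conversion only, assuming lowercase attribute names; the function name dataAttributeToProperty is hypothetical and is not a browser API.

```javascript
// Hypothetical helper: maps a data-* attribute name to the property name
// under which script would find it on element.dataset.
function dataAttributeToProperty(attributeName) {
  if (!attributeName.startsWith("data-")) {
    throw new Error(`not a data-* attribute: ${attributeName}`);
  }
  // Drop the "data-" prefix, then turn each "-x" into "X".
  return attributeName
    .slice("data-".length)
    .replace(/-([a-z])/g, (_match, letter) => letter.toUpperCase());
}

console.log(dataAttributeToProperty("data-date-of-birth"));
console.log(dataAttributeToProperty("data-id"));
```

So an attribute written as data-date-of-birth in markup would be read in a browser as element.dataset.dateOfBirth.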
HTML also makes all event handler attributes from HTML4, which take the form onevent, global attributes and adds several new event handler attributes for new events it defines; for instance, the onplay event handler attribute for the play event which is used by the API for the media elements (video and audio).
Content Model
The content model is what defines how elements may be nested: what is allowed as children (or descendants) of a certain element.
At a high level, HTML4 had two major categories of elements, “inline” (e.g. span, img, text), and “block-level” (e.g. div, hr, table). Some elements did not fit in either category.
Some elements allowed “inline” elements (e.g. p), some allowed “block-level” elements (e.g. body), some allowed both (e.g. div), while other elements did not allow either category but only allowed other specific elements (e.g. dl, table), or did not allow any children at all (e.g. link, img, hr).
Notice the difference between an element itself being in a certain category, and having a content model of a certain category; for instance, the p element is itself a “block-level” element, but has a content model of “inline”.
To make it more confusing, HTML4 had different content model rules in its Strict, Transitional and Frameset flavors; for instance, in Strict, the body element allowed only “block-level” elements, but in Transitional, it allowed both “inline” and “block-level”.
To make things more confusing still, CSS uses the terms “block-level element” and “inline-level element” for its visual formatting model, which is related to CSS’s ‘display’ property and has nothing to do with HTML’s content model rules.
HTML does not use the terms “block-level” or “inline” as part of its content model rules, to reduce confusion with CSS. However, it has more categories than HTML4, and an element can be part of none of them, one of them, or several of them.
- Metadata content, e.g. link, script.
- Flow content, e.g. span, div, text. This is roughly like HTML4’s “block-level” and “inline” together.
- Sectioning content, e.g. aside, section.
- Heading content, e.g. h1.
- Phrasing content, e.g. span, img, text. This is roughly like HTML4’s “inline”. Elements that are phrasing content are also flow content.
- Embedded content, e.g. img, iframe, svg.
- Interactive content, e.g. a, button, label. Interactive content is not allowed to be nested.
The body element now allows flow content. This is closer to HTML4 Transitional than HTML4 Strict.
Further changes include:
- The address element now allows flow content, but with no heading content descendants, no sectioning content descendants, and no header, footer, or address element descendants.
- HTML4 allowed object in head. HTML does not.
- WHATWG HTML allows link and meta as descendants of body if they use microdata attributes.
- The noscript element was a “block-level” element in HTML4, but is phrasing content in HTML.
- The table, thead, tbody, tfoot, tr, ol, ul and dl elements are allowed to be empty in HTML.
- Table elements have to conform to the table model (e.g. two cells are not allowed to overlap).
- The table element now does not allow col elements as direct children. However, the HTML parser implies a colgroup element, so this change should not affect text/html content.
- The table element now allows the tfoot element to be the last child.
- The caption element now allows flow content, but with no descendant table elements.
- The th element now allows flow content, but with no header, footer, sectioning content, or heading content descendants.
- The a element now has a transparent content model (except it does not allow interactive content descendants), meaning that it has the same content model as its parent. This means that the a element can now contain e.g. div elements, if its parent allows flow content.
- The ins and del elements also have a transparent content model. HTML4 had similar rules in prose that could not be expressed in the DTD.
- The object element also has a transparent content model, after its param children.
- The map element also has a transparent content model. The area element is considered phrasing content if there is a map element ancestor, which means that area elements do not need to be direct children of map.
- The fieldset element no longer requires a legend child.
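The transparent content model described above for the a element can be illustrated with a small lookup. This is a simplified sketch under stated assumptions: the two tables cover only a handful of elements chosen for illustration, they ignore conditional cases in the specification, and the names CATEGORIES, CONTENT_MODEL and allowedInsideAnchor are hypothetical.

```javascript
// Illustrative subset of element categories (not the full specification tables).
const CATEGORIES = {
  span: ["flow", "phrasing"],
  img: ["flow", "phrasing", "embedded"],
  div: ["flow"],
  button: ["flow", "phrasing", "interactive"],
};

// Illustrative subset: what kind of content a few parent elements accept.
const CONTENT_MODEL = {
  p: "phrasing",
  span: "phrasing",
  div: "flow",
  body: "flow",
};

// Hypothetical check: may `child` appear inside an a element whose parent is `parent`?
function allowedInsideAnchor(parent, child) {
  const inheritedModel = CONTENT_MODEL[parent]; // transparent: same model as the parent
  const categories = CATEGORIES[child] ?? [];
  if (categories.includes("interactive")) {
    return false; // the one exception: no interactive content descendants
  }
  return categories.includes(inheritedModel);
}

console.log(allowedInsideAnchor("div", "div"));
console.log(allowedInsideAnchor("p", "div"));
console.log(allowedInsideAnchor("div", "button"));
```

The same a element therefore accepts a div when it sits inside a div, but not when it sits inside a p, because the answer depends on the parent rather than on a itself.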
New APIs
HTML introduces a number of APIs that help in creating Web applications. These can be used together with the new elements introduced for applications:
- Media elements (video and audio) have APIs for controlling playback, synchronising multiple media elements, and timed text tracks (e.g. subtitles).
- An API for form constraint validation (e.g. the setCustomValidity() method).
- An API for commands that the user can invoke.
- An API that enables offline Web applications, with an application cache.
- An API that allows a Web application to register itself for certain protocols or media types, using registerProtocolHandler() and registerContentHandler().
- Editing API in combination with a new global contenteditable attribute.
- Drag & drop API in combination with a draggable attribute.
- An API that exposes the components of the document’s URL and allows scripts to navigate, redirect and reload (the Location interface).
- An API that exposes the session history and allows scripts to update the document’s URL without actually navigating, so that applications don’t need to abuse the fragment component for “Ajax-style” navigation (the History interface).
- An API for base64 conversion (atob() and btoa() methods).
- An API to schedule timer-based callbacks (setTimeout() and setInterval()).
- An API to prompt the user (alert(), confirm(), prompt(), showModalDialog()).
- An API for printing the document (print()).
- An API for handling search providers (AddSearchProvider() and IsSearchProviderInstalled()).
- The Window object has been defined.
- An API for microdata.
- An API for immediate-mode bitmap graphics (the 2d context for the canvas element).
- An API for cross-document messaging and channel messaging (postMessage() and MessageChannel).
- An API for running scripts in the background (Worker and SharedWorker).
- An API for client-side storage (localStorage and sessionStorage).
- An API for bidirectional client-server communication (WebSocket).
- An API for server-to-client data push (EventSource).
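As a small example of one of the simpler APIs in the list above, the sketch below round-trips a string through btoa() and atob(). It assumes an environment where both are available as globals (browsers, and recent versions of Node.js) and input limited to characters in the Latin-1 range, which is what btoa() accepts. The wrapper name base64RoundTrip is hypothetical.

```javascript
// Hypothetical wrapper: encodes a string to base64 and decodes it again.
function base64RoundTrip(text) {
  const encoded = btoa(text); // binary string -> base64
  const decoded = atob(encoded); // base64 -> binary string
  return { encoded, decoded };
}

const { encoded, decoded } = base64RoundTrip("hello");
console.log(encoded);
console.log(decoded);
```

Text outside the Latin-1 range has to be converted to bytes first (for example with TextEncoder) before it can be passed to btoa().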