A common problem when putting content on a web server is
that ordinary text files contain characters that have
special meaning to a web browser.
These include the ampersand (&), the less-than symbol (<), the greater-than symbol (>), and others.
In addition, valid HTML or XHTML documents require some information at the beginning (the document prolog) and some more at the end (the document epilog). (XHTML is a more modern version of HTML; today's web browsers understand both formats.)
In this project you will write a Perl script that transforms a plain text file into a valid XHTML file.
Create a Perl script that reads text from a file whose name is
provided on the command line, and produces a valid XHTML
document as output.
The title of the document will be the name of the file.
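As a starting point, reading the file named on the command line might be sketched as follows. The helper name read_lines is hypothetical and not part of the assignment; the sketch only assumes that the first command-line argument, $ARGV[0], holds the file name.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the assignment): return the lines of the
# file at $path, or die with the system error message if it cannot be read.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path': $!\n";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# When a file name is given on the command line, echo that file unchanged.
# The same name would later serve as the document title.
if (@ARGV) {
    my $title = $ARGV[0];
    print read_lines($title);
}
```

Using the three-argument form of open with a lexical file handle keeps unusual file names from being misread as open modes.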
For example, if a text file named hello contains the following text:
Hello, World & Class!
<Good-Bye!>
Then the XHTML encoded output should look like this:
 1. <?xml version="1.0" encoding="UTF-8"?>
 2. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 3.     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 4. <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 5. <head>
 6. <title>hello</title>
 7. </head>
 8. <body>
 9. <pre>
10. Hello, World &amp; Class!
11. &lt;Good-Bye!&gt;
12. </pre>
13. </body>
14. </html>
The spacing of lines 1 to 9 (the required XHTML document prolog) and lines 12 to 14 (the required document epilog) is for readability only and is not required.
Your script must make the following changes to the input:
& to &amp;
< to &lt;
> to &gt;
The name of the file is available in $ARGV[0].
filter.pl can be used as a model for your script.
(A copy can be found on YborStudent in ~wpollock/bin.)
Instead of using many print statements, you can learn about Perl's here document.
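The substitutions and the here-document hint can be illustrated with a minimal sketch. It is one possible approach, not the model script filter.pl, and the helper names escape_xhtml and xhtml_document are hypothetical. The prolog and epilog text is taken from the example output above.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the three characters named in the assignment.
# The ampersand is handled first; otherwise the "&" introduced by "&lt;"
# and "&gt;" would itself be escaped a second time.
sub escape_xhtml {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: escape the title and text, then wrap them in the
# prolog and epilog using here documents instead of many print statements.
sub xhtml_document {
    my ($title, $text) = @_;
    $title = escape_xhtml($title);
    my $body = escape_xhtml($text);

    # Double-quoted here document: $title is interpolated.
    my $prolog = <<"END_PROLOG";
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>$title</title>
</head>
<body>
<pre>
END_PROLOG

    # Single-quoted here document: the text is taken literally.
    my $epilog = <<'END_EPILOG';
</pre>
</body>
</html>
END_EPILOG

    return $prolog . $body . $epilog;
}
```

For instance, print xhtml_document("hello", $text) would write the whole document to standard output, where $text holds the contents of the input file.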
To be turned in: a copy of your Perl script.
A sample text file you can use for testing your script
is available on YborStudent.hccfl.edu at
~wpollock/mycat.c.
You can type or send as email to . Please see your syllabus for more information about submitting projects.