FTP has two modes of
operation: text and binary.
In text mode some characters may be changed when
transferred.
(This is considered a feature
and not a bug since text files
have different end of line
conventions on different
platforms.)
For most files but especially graphics, applets, and other media,
you need to use binary mode.
Oddly for text files the mode usually doesn't matter.
So make sure you uploaded your files in binary
mode.
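The damage text mode can do to a binary file can be sketched in a few lines of Python. This is a simplified model, not what any particular FTP program does: the function name `simulate_text_mode` and the sample bytes are made up for illustration, and the only change modelled is rewriting line-feed bytes as carriage-return plus line-feed.

```python
def simulate_text_mode(data: bytes) -> bytes:
    """Mimic a text-mode transfer that rewrites every line ending as CR LF."""
    # Normalize first so existing CR LF pairs are not doubled up.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# A GIF signature followed by made-up "pixel" bytes that happen to contain 0x0A.
image_bytes = b"GIF89a" + bytes([0x01, 0x0A, 0x02, 0x0A, 0x03])
text_bytes = b"line one\nline two\n"

damaged_image = simulate_text_mode(image_bytes)
converted_text = simulate_text_mode(text_bytes)

# The text is still readable; the image has grown by two bytes and is corrupt.
print(len(image_bytes), len(damaged_image))
print(converted_text)
```

In the text file the extra bytes only change the line endings, but in the image every byte is data, so any inserted byte corrupts it.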
Another problem is having connection issues.
These can occur when your computer or your ISP
has security barriers in place, such as virus scanners and
firewalls.
(It is also common to make a typo in the hostname, username,
or password, so always check that first!)
FTP is a strange protocol, invented before security
was a big concern.
It requires two separate connections between the two computers.
The normal mode of operation is called active mode.
In active mode the second connection is made from
the server back to your computer.
Most firewalls will block this!
To work around this issue you must use FTP in passive mode.
In passive mode your computer makes both required
connections to the server,
which most firewalls will allow.
So make sure passive mode (or passive transfers)
is selected before trying to connect.
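If you upload with a script instead of a graphical FTP program, the same two settings apply. Here is a minimal sketch using Python's standard `ftplib` module. The function name `upload_binary` and its parameters are hypothetical, and the host, username, and password must be whatever your own web host gave you.

```python
from ftplib import FTP


def upload_binary(host: str, user: str, password: str,
                  local_path: str, remote_name: str) -> None:
    """Upload one file using passive mode and a binary transfer."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        # Passive mode: this computer opens both connections to the server.
        ftp.set_pasv(True)
        with open(local_path, "rb") as fh:
            # storbinary sends the bytes unchanged (binary mode).
            ftp.storbinary(f"STOR {remote_name}", fh)
```

Nothing happens until you call the function, for example `upload_binary("ftp.example.com", "me", "secret", "index.html", "index.html")`, where every argument is a placeholder.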
Another issue is the character encoding used in your web page. All text is actually represented by a set of numbers, on all computers. Which number represents which character is called the encoding or the character encoding.
When you use a text editor such as Notepad (or TextEdit)
the text files are encoded using some platform default
encoding.
On some modern systems this default is Unicode (UTF-16).
Sadly this doesn't match the default encoding used by most
web browsers!
The result is the numbers are interpreted incorrectly and you see
all sorts of junk on the screen.
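A short Python sketch shows how the same numbers turn into junk when read with the wrong encoding. The sample word is arbitrary, and the pair of encodings (UTF-8 written, Windows-1252 assumed by the reader) is just one common mismatch.

```python
text = "café"

# Save the text as UTF-8: the é becomes two bytes, 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

# A reader that assumes Windows-1252 treats each byte as its own character.
misread = utf8_bytes.decode("cp1252")

# A reader that assumes the right encoding gets the original text back.
correct = utf8_bytes.decode("utf-8")

print(misread)
print(correct)
```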
There are many hundreds of different encoding schemes used in
the world today.
Some of the common encodings are UTF-8 and 8859-1 (also called ISO-8859-1 or ISO Latin 1).
Windows systems have always used Microsoft encodings that
Microsoft and IBM call code pages.
By default Windows XP uses an encoding called CP-1252 or Windows-1252, and old DOS
systems used CP-437.
Microsoft likes to call the default Windows encoding ANSI, perhaps to pretend it is some sort
of national standard encoding.
(I guess it is sort of standard considering the number of
Windows systems in use in the world today.)
I found this information at
scripts.sil.org/IWS-Chapter03:
When Windows was being developed, the American National Standards Institute (ANSI) was in the process of drafting a standard that eventually became ISO 8859-1 "Latin 1". Microsoft created their codepage 1252 for Western European languages based on an early draft of the ANSI proposal, and began to refer to this as "the ANSI codepage". Codepage 1252 was finalised before ISO 8859-1 was finalised, however, and the two are not the same: codepage 1252 is a superset of ISO 8859-1.
Later, apparently around the time of Windows 95 development, Microsoft began to use the term "ANSI" in a different sense to mean any of the Windows codepages, as opposed to Unicode. Therefore, currently in the context of Windows, the terms "ANSI text" or "ANSI codepage" should be understood to mean text that is encoded with any of the legacy 8-bit Windows codepages rather than Unicode. It really should not be used to mean the specific codepage associated with the US version of Windows, which is codepage 1252.
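The difference between the two encodings can be seen with a few bytes in Python. This is only an illustration using Python's own `cp1252` and `latin-1` codecs; the sample bytes are made up. Bytes in the range 0x80 to 0x9F are printable characters such as curly quotes and the euro sign in codepage 1252, but invisible control codes in ISO 8859-1.

```python
raw = b"\x93Hi\x94 \x80 5"

as_cp1252 = raw.decode("cp1252")    # curly quotes and a euro sign
as_latin1 = raw.decode("latin-1")   # the same bytes become control codes

# Outside that range the two encodings agree, for example on é (0xE9).
same_in_both = b"caf\xe9".decode("cp1252") == b"caf\xe9".decode("latin-1")

print(as_cp1252)
print(repr(as_latin1))
print(same_in_both)
```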
I don't currently have a Mac or Vista but I am seeing a large
number of student web pages encoded as Unicode (UTF-16) and I suspect that is the
new default on at least one of these platforms.
Using a different encoding than
the web browser expects will likely make your page look
bad (or completely unreadable).
The fix is very simple:
Choose Save As... in Notepad
and select an encoding such as UTF-8 or ISO-8859-1.
Then re-upload your web pages, making sure to use the
binary mode transfers option.
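If you have many pages to fix, the same re-save can be scripted. The following is a rough sketch, assuming you already know the encoding the file was saved in; the function name `reencode` and the sample file are hypothetical.

```python
import tempfile
from pathlib import Path


def reencode(path: Path, source_encoding: str, target_encoding: str = "utf-8") -> None:
    """Rewrite a text file in place using a different character encoding."""
    # Work on raw bytes so line endings are left exactly as they were.
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode(target_encoding))


# Demonstration on a throwaway file saved as UTF-16.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "index.html"
    page.write_bytes("<p>café</p>".encode("utf-16"))
    reencode(page, "utf-16")
    result = page.read_bytes()

print(result)
```

Keep a backup copy before running something like this on real pages, since a wrong guess for the source encoding will either fail or scramble the text.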
It is possible to add an HTML tag to a web page to
indicate the encoding used.
However some web servers override that and tell the browser
"this page uses the XYZ encoding",
so setting it in the web page won't always help.
To indicate the encoding used on some web page, add the following
tag in the HEAD
section of the page:
<meta http-equiv="Content-Type" content="text/html; charset=encoding">
Replace encoding with utf-16, utf-8, iso-8859-1, windows-1252, or whatever encoding you used to create
that web page.
The official list of encoding scheme names can be found at:
www.iana.org/assignments/character-sets.
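To check which encoding a page declares, you can search its bytes for the charset value. This sketch is an assumption-laden helper, not a full HTML parser: the name `declared_charset` is made up, it only works for pages saved in an ASCII-compatible encoding (so not UTF-16), and it validates the name against Python's own codec list, which is close to but not the same as the IANA list.

```python
import codecs
import re

# Matches charset=NAME, with or without quotes around the name.
META_CHARSET = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)


def declared_charset(html_bytes: bytes):
    """Return the declared charset in lower case, or None if absent or unknown."""
    match = META_CHARSET.search(html_bytes)
    if match is None:
        return None
    name = match.group(1).decode("ascii")
    try:
        codecs.lookup(name)  # raises LookupError for names Python doesn't know
    except LookupError:
        return None
    return name.lower()


page = b'<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
print(declared_charset(page))
```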
To view a page that has a weird encoding you can tell the
browser to use that encoding.
Under the View
menu of your web browser you can change
the encoding used by the browser.
When I see a page that doesn't look right I try
ISO-8859-1
or UTF-8
and usually one of those will work fine.
UTF-16 uses at least two bytes per character, not one.
So when every other character is a weird character
(on my system, a black diamond with a question mark in it),
it is likely that the page was encoded as UTF-16 and
your web browser is set to iso-8859-1, utf-8, or Windows-1252.
(If your web pages look normal on your system it is because the web browser uses the system default encoding when viewing local files. Once you upload your web pages the default encoding is set by the web server instead, usually UTF-8 or ISO-8859-1.)
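The "every other character" pattern is easy to reproduce in Python. The sample text is arbitrary, and the sketch uses the little-endian form of UTF-16 without a byte-order mark to keep the output predictable.

```python
utf16_bytes = "Hi!".encode("utf-16-le")

# Read as ISO-8859-1, each byte becomes one character, so every real
# character is followed by an invisible NUL (byte value 0).
as_latin1 = utf16_bytes.decode("latin-1")

# Taking every second character recovers the readable text.
visible = as_latin1[::2]

print(utf16_bytes)
print(repr(as_latin1))
```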
A font (for the purpose of this discussion) is
a collection of tiny graphics, each associated with a number in
some encoding.
For example most fonts associate the number 65
with an upper-case letter A.
Since there are potentially millions of characters, a given font
only has graphics for some subset of those characters
(a few hundred).
If you see a box or a weird question mark symbol it is sometimes
because you used some character that the current font doesn't
have a graphic for.
This can be a problem since not all users have the same fonts installed. In that case a web browser will substitute a font that is installed for the unknown one. So if you use a fancy font in a web page and it looks fine on your screen, it may look awful on some other user's system if they don't have that font installed!
Fonts generally are not free, so Microsoft, Red Hat, Apple, and other computer vendors pay a license fee for the fonts they bundle on their systems. The result is different systems almost always have different sets of fonts installed on them.
The best advice is to use fewer fonts, ones that you believe will be available to your audience. Provide an alternative font and make sure your web pages look okay in that default font.
This isn't intended as a full discussion of fonts but you
should know there are font families, which are groups of fonts
with similar characteristics.
You can specify the family to use if some specific font is not
installed, and the system will pick an appropriate one.
Here's an example of specifying styles for paragraphs.
The style for paragraphs says to use the Georgia font,
and if not available try Times New Roman instead,
and if that isn't available either, to pick some default font in
the serif font family.
<style> p { font-family: Georgia, "Times New Roman", serif; } </style>
(The above goes into the HEAD section of a web page.)
See Fonts.htm for some more details on fonts.
Many characters that are legal in a filename are not legal in
a URL or web link.
The most common problem is with spaces in the filenames.
It is easier to just use letters and digits (plus the extension)
for naming files; then you don't need to worry.
(While many web browsers are forgiving
about such errors
and will try to guess what you meant, not all browsers are so nice.)
If you do include any unusual characters in your filenames, they
should be encoded using what is called URL encoding
or sometimes percent encoding.
You simply replace each special character with a percent sign
followed by two hex (hexadecimal) digits.
The two digits indicate what the character was.
For example if you have saved an image file with the name
New York.gif, the space must be
encoded and the IMG tag would look
something like this:
<IMG src="New%20York.gif">
You can view this URL encoding reference from w3schools.com for a list of characters and their encoded equivalents. (Note even normal letters and digits can be encoded, but there is no point to doing that.)
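You don't have to look up the codes by hand. As a small illustration, Python's standard `urllib.parse` module can do the encoding and decoding; the variable names here are arbitrary.

```python
from urllib.parse import quote, unquote

# Encode the file name for use in a link; the space becomes %20.
encoded = quote("New York.gif")

# Decoding gives back the original file name.
decoded = unquote(encoded)

img_tag = f'<IMG src="{encoded}">'
print(img_tag)
```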
A common problem is having images not show up when you view your web page. Here are several common reasons images might not show up:
If your image file is named foo.gif then you must
have an IMG tag like this:
<IMG SRC="foo.gif">
If the file is really named foo.gif.gif, Foo.gif, foo.jpg, or
anything else, the web browser won't be able to find it.
The file name used in the IMG tag must exactly match the file's actual name.
My Documents, the images won't be found.
C:\Documents and Settings\user\Desktop\image project\foo.gif,
which is an absolute pathname.
This would be a big mistake!
The web page won't work when uploaded to Blackboard.com and then downloaded
to your instructor's computer.
Just use the name of the file itself.
.gif, .jpg, or .png).
Use a simple, short name, that contains nothing except letters and digits
(and the extension).
Note some folk have Windows set to hide extensions, so their files
look like they have the name foo.gif,
when they really have the name foo.gif.gif.
There is a way to turn off this Windows feature, so you can see
the entire names of files.
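Several of the problems above can be caught before uploading by comparing the names in your IMG tags with the real file names. The following is a rough sketch under some assumptions: the names `ImgSrcCollector` and `missing_images` are made up, and it expects the images to sit in the same folder as the page and to be referenced by plain file name. It compares names exactly, including upper and lower case, even on systems such as Windows that ignore case.

```python
import os
import tempfile
from html.parser import HTMLParser
from urllib.parse import unquote


class ImgSrcCollector(HTMLParser):
    """Collect the src value of every IMG tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lower case.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def missing_images(html_text: str, folder: str) -> list:
    """Return the src values that do not exactly match a file name in folder."""
    collector = ImgSrcCollector()
    collector.feed(html_text)
    actual_names = set(os.listdir(folder))
    # unquote() turns %20 back into a space before comparing.
    return [src for src in collector.sources if unquote(src) not in actual_names]


# Demonstration: the file is really named Foo.gif, so "foo.gif" is reported.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "Foo.gif"), "wb").close()
    page = '<IMG SRC="foo.gif"><IMG SRC="Foo.gif">'
    problems = missing_images(page, folder)

print(problems)
```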