Brian Joseph, Angelo Costanzo, and Jonathan Slocum
Albanian is an Indo-European language spoken mainly in the Balkan Peninsula by approximately five million people. It is the principal and official language of Albania, the principal and a co-official language of Kosovo (with Serbian), and the principal and co-official language of many western municipalities of the Republic of Macedonia (with Macedonian). Albanian is also spoken widely in some areas in Greece, southern Montenegro, southern Serbia, and in some towns in southern Italy and Sicily.
The terms Albania and Albanian are exonyms. The Albanians call themselves Shqiptar, their language shqip, and their country Shqipëria. These words are likely derived from the adverb shqip 'clearly' based on Latin excipere (whence shqipoj 'speak clearly'), though there are alternative explanations. Other languages use a form from earlier *alban- or *arban- (the difference most likely arising from a rhotacism process in Greek). Most use a form with the same origin as Eng. Albanian (e.g., It. Albanese, Serb. Albanac, Germ. Albaner, etc.). In Turkish, the Albanians are called Arnavut, derived in some way from arvan-. The terms Albania and Albanian are not to be confused with the area in the Caucasus referred to in ancient texts as Albania or the language spoken there referred to as Albanian (an ancestor of the modern Udi language spoken in Azerbaijan and a member of a language family with no confirmed connections to the Indo-European language family).
When compared with most of the other Indo-European languages, Albanian's first attestations are rather recent, with the first surviving fragment from the mid-15th century and the first major text from the mid-16th century. For this reason, these lessons cover Albanian from the modern standard language back to earlier attestations, starting with the modern variety to get a grounding in the language and working back to older material.
Albanian and Indo-European
Albanian forms a separate branch of Indo-European and cannot conclusively be closely connected with any other Indo-European language. There have been attempts to connect Albanian with some of the sparsely attested ancient languages of the Balkans, particularly Illyrian but also Dacian and Thracian. While this is plausible geographically, given that we know the Illyrians lived in an area that includes the modern Albanian-speaking area, there is no concrete linguistic evidence for any of these proposals. Some have proposed a connection between the ancestor of Albanian (without assigning a specific identity to this ancestor) and a Latinized variety of that ancestor that may have ultimately yielded Romanian, as there are several shared words not of Latin origin in both languages.
Albanians and Albanian in the Historical Record
Mention of the Albanian people and the Albanian language appears rather late in the historical record. The earliest uncontroversial mention of the Albanian people is in Michael Attaleiates's late 11th century history of the Byzantine Empire, where he refers to the Albanoi taking part in a revolt against Constantinople and the Arvanitai as subjects of the duke of Dyrrachium (modern Durrës, Albania's main port on the Adriatic).
The first mentions of the Albanian language predate its first attestation by several centuries. Elsie (1991) describes a 1285 text in which the investigation of a robbery in Ragusa (modern Dubrovnik, Croatia) refers to a witness who said Audivi unam vocem clamantem in monte in lingua albanesca 'I heard a voice crying in the mountains in the Albanian language'. In the 1308 Anonymi Descriptio Europae Orientalis 'Anonymous description of Eastern Europe', the author writes Habent enim Albani prefati linguam distinctam a Latinis, Grecis et Sclavis ita quod in nullo se inteligunt cum aliis nationibus 'The aforementioned Albanians have a language which is entirely distinct from that of the Latins, Greeks and Slavs such that in no way can they communicate with other peoples'.
Earliest Attestations of the Albanian Language
While the earliest attested Albanian texts are from over a century later, the existence of Albanian texts is mentioned in 1332 in Directorium ad passagium faciendum (by a French monk whose identity is uncertain): licet Albanenses aliam omnino linguam a latina habeant et diversam, tamen litteram latinam habent in uso et in omnibus suis libris 'The Albanians have a language different from Latin, although they use Latin letters in their books' (note that this could potentially be saying that Albanians just wrote in Latin).
The oldest unambiguously attested Albanian is a single line embedded in a Latin document from 1462. It appears in a letter from Pal Engëlli, a bishop and associate of Skënderbeu, and is a translation of a baptismal formula (formula e pagëzimit) into Geg Albanian:
|Vnte' paghesont premenit Atit et birit et spertit senit|
|'I baptize you in the name of the father, the son, and the holy spirit'|
|cf. Std. Alb. Unë të pagëzoj në emër të Atit, të Birit, e të Shpirtit të Shenjtë|
Over the following century the attested Albanian "texts" are of similar size, including a single line in a Latin play from 1483 and a short list of Albanian words from 1496.
The first larger text is Meshari i Gjon Buzukut 'The Missal of Gjon Buzuku', written in 1555 (see Lesson 5). Like the earlier attestations of Albanian, Buzuku's 'Missal' is written in Geg. Most of the early documentation of Albanian is in Geg, as that area was more difficult for the Ottomans to subdue (and, consequently, to discourage the use of Albanian there). The earliest attestation of Tosk Albanian is the E mbsuame e krështerë 'Christian doctrine' of Lekë Matrënga from 1592, written in Hora e Arbëreshëvet, an Arbëresh settlement in northwestern Sicily.
Structure of Albanian
Some general characteristics of the Albanian language:
- Albanian shows a fairly complex nominal inflection system. Albanian has a three-gender system (masculine, feminine, neuter), though the exact status of the neuter gender is disputed. Five cases remain from Proto-Indo-European: nominative, accusative, dative, genitive, and ablative, though the dative and genitive are morphologically identical. In addition to inflecting for case and number, Albanian nouns also inflect for definiteness. As is also seen in several other languages of the Balkans (e.g., Romanian, Macedonian, Bulgarian), Albanian has a postposed definite article, e.g., zog 'bird', zog-u 'the bird'.
- The verb system makes heavy use of analytic forms. These include several compound past tenses (e.g., kam lexuar 'I have read'), the future tense (e.g., do të lexoj 'I will read'), the present progressive (e.g., po lexoj 'I am reading'), and the past passive (e.g., u lexua 'it was read'), among others. Albanian also has a substantial inventory of synthetic verb forms, some familiar (e.g., present, imperfect, past definite, optative), and some less familiar to learners of Indo-European (e.g., the admirative mood, see Lesson 5).
- One of the most noticeable features of Albanian is the vast number of "small words" that exist. It is not that there is a huge inventory of different "small words" in Albanian; rather there are many instances in which words having the same form are found in different functions. Some of these small words include: the attributive article (or as we call it, nyje), that can take four different forms depending on a variety of factors and is required with most adjectives, some nouns, and all instances of nouns in the genitive case (it is what distinguishes the genitive from the dative); subordinators; weak pronouns; etc. This often gets a bit tricky as, e.g., të can be an attributive article, a pronominal clitic, and a subordinator.
Variation in Albanian
Albanian dialects are traditionally divided into two groups: Geg dialects in the north, and Tosk dialects in the south. The dividing line is traditionally considered to be the Shkumbin river, which runs east-west through central Albania (at approximately the 41st parallel north). Dialects spoken in Kosovo and Macedonia are Geg dialects, while those spoken in northwestern Greece are Tosk dialects. While they are technically Tosk dialects, Arvanitika (spoken in Greece, historically in Attica and Boeotia) and Arbëresh (spoken in southern Italy and Sicily) are also often considered major Albanian dialects; these dialects were brought to these areas after the Ottoman conquest of the western Balkans in the late 15th century, and they are maintained to this day.
Major differences between Geg and Tosk
- Geg has nasal vowels while Tosk does not, e.g., Geg âsht vs. Tosk është 'is'
- Geg has phonemic vowel length, e.g., dhē 'earth' vs. dhe 'and'. Nearly all Tosk dialects lack vowel length distinctions, e.g., dhe 'earth', 'and'.
- Tosk dialects have undergone a change by which intervocalic n became r. No such change has occurred in Geg, e.g., Geg Shqipnia vs. Tosk Shqipëria 'Albania', Geg gjarpën vs. Tosk gjarpër 'snake'.
- The Tosk future tense is formed with the marker do followed by a conjugated present subjunctive form of the verb (e.g., do të shkoj 'I will go'), while the Geg future tense is formed by a conjugated form of the verb 'have' followed by an infinitive (e.g., kam me shkue 'I will go').
- Tosk lacks infinitives altogether (similar to several other languages of the Balkans), while Geg maintains the infinitive (composed of me plus the past participle).
- In Tosk, most verbs have a past participle in -r (e.g., fjetur 'slept', qeshur 'laughed', kërkuar 'requested'). In Geg, no verbs have this ending (e.g., fjetë 'slept', qeshë 'laughed', kërkuë 'requested').
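The two future-tense patterns listed above can be written out as a minimal Python sketch. The function names are hypothetical, and no conjugation is attempted: the caller supplies forms that are already inflected, such as the ones cited in the list. The particle të is hard-coded here on the basis of the example do të shkoj.

```python
# Illustrative sketch only: function names are hypothetical and the
# verb forms are passed in ready-made (no conjugation is attempted).

def tosk_future(subjunctive_form: str) -> str:
    """Tosk pattern: invariant marker 'do' + 'të' + conjugated present subjunctive."""
    return f"do të {subjunctive_form}"


def geg_future(have_form: str, participle: str) -> str:
    """Geg pattern: conjugated 'have' + infinitive (me + past participle)."""
    return f"{have_form} me {participle}"


print(tosk_future("shkoj"))        # do të shkoj
print(geg_future("kam", "shkue"))  # kam me shkue
```

Only the marker in each pattern is fixed; person and number are carried by the subjunctive form in Tosk and by the form of 'have' in Geg.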
Nearly all of the historical centers of Albanian culture (Durrës, Tiranë, Shkodër, Prishtinë, Tetovë, etc.) are located squarely in Geg-speaking territory. However, Standard Albanian is predominantly based on Tosk. The promotion of a Tosk-based variety as a standard is actually quite recent, and likely has much to do with the fact that Enver Hoxha, Albania's dictator from the 1940s until the 1980s, was from Gjirokastër (in southern Albania), and thus was a native speaker of a Tosk variety. Even though Kosovo and Macedonia lie predominantly in Geg-speaking areas, the standard variety used there is the same one used in Albania (i.e., it is based on Tosk).
Standard Albanian, while predominantly based on Tosk, does also have some Geg features. For example, the Standard Albanian 1st person singular present verb ending -j is a Geg feature; most Tosk dialects, on the other hand, have the ending -nj.
As with the other languages of the Balkans, the development of Albanian has been drastically affected by contact with speakers of other languages.
While reports of over 90 percent of Albanian's lexicon being composed of foreign words are definitely overstated, lexical borrowing has had an enormous effect on Albanian. There are several strata of lexical borrowings.
- Early Greek influence: Limited to a small group of borrowings, e.g., Ancient Greek makhana > mokërë 'millstone', lakhana > lakër 'cabbage'.
- Latin influence: The influence of Latin on the Albanian lexicon is vast, e.g., Latin lex > ligj 'law', amicus > mik 'friend', aurum > ar 'gold'. Albanian also shows a number of calques from Latin, e.g., decem-brius > dhjet-or 'December', manu-scriptus > dorë-shkrim 'manuscript'.
- South Slavic influence: There are also a substantial number of words borrowed from South Slavic, e.g., Slavic nevolja > Alb. nevojë 'need'; gotov > Alb. gati 'ready'.
- Modern Greek influence: While the Ancient Greek influence on Albanian is minimal, the influence from Modern Greek has been much larger, e.g., Greek kyverno > qeveris 'to govern', krevati > krevet 'bed', staphida > stafidhe 'raisin', as well as the pan-Balkan 'unceremonious mode of address' bre, more (along with several alternate forms, originally from Greek more).
- Turkish influence: As Albania was under Ottoman rule for over 400 years, there is a strong Turkish element in the Albanian lexicon, e.g., Turkish haydi > hajde 'c'mon!; let's go!', pencere > penxhere 'window'; along with a wide range of culinary vocabulary (e.g., patëllxhan 'eggplant'; çorbë 'soup'; byrek 'delicious pastry with a variety of fillings').
- Italian and English influence: Over the past century, the two major influences on the Albanian lexicon have been Italian and English, e.g., Italian bagno > Alb. banjë 'bathroom', tavolino > Alb. tavolinë 'table'; English jogging > Alb. xhoging, to charge > Alb. çarxhoj.
The Balkan Sprachbund
As part of the Balkan Sprachbund, Albanian shares a number of features with the other languages of the Balkans (e.g., Greek, Bulgarian, Macedonian, Romanian, Turkish, Romani, etc.). The following are some of Albanian's more notable Balkan features:
- Albanian has a postposed definite article, e.g., qen 'dog', qen-i 'the dog'. This is also seen in Balkan Romance and Balkan Slavic (e.g., Mac. kuche 'dog', kuche-to 'the dog'). While many of the features of the Balkan Sprachbund are considered to have ultimately originated in Greek, it has been proposed that Albanian is the source of this particular feature (though it is difficult to tell, as the earliest attestations of Albanian only date back 500 years).
- While it does have a more recent formation that fulfills some of the roles of the infinitive in other languages, Tosk (like Greek, Macedonian, etc.) has lost the infinitive from earlier stages of the language. It is maintained in Geg (see Lesson 4 for a discussion of the Geg infinitive).
- The Tosk future tense is an analytic formation composed of an invariant particle from the verb for 'want' followed by a present subjunctive form of the verb (e.g., do të pi 'I will drink', where do is from the verb dua 'want'). Most of the other Balkan languages have the same pattern (e.g., Grk. tha pino, Mac. k'e pijam, where tha and k'e are invariant particles from the Greek and Macedonian verbs meaning 'want', respectively).
- Albanian has the admirative mood, which is used, among other things, to express shock or surprise (see Lesson 5). This is also seen in Turkish, Bulgarian and Macedonian.
The Albanian Alphabet & Pronunciation
The Albanian Alphabet
The earliest texts were written in various forms of the Latin alphabet, with additional characters borrowed from the Greek alphabet (as well as some additional characters of other origins). Up until the late 19th century, the script used to write Albanian appears to have been dependent on the religion of the scribe: Latin for Catholics, Greek for Orthodox Christians, and Perso-Arabic script for Muslims. In the late 19th century there were various attempts to create a standardized alphabet for Albanian; in 1908, the modern Albanian alphabet was codified at the Congress of Manastir.
The modern Albanian alphabet consists of 36 letters, nine of which are digraphs (dh, gj, ll, nj, rr, sh, th, xh, zh).
As briefly discussed above, Geg has nasalized vowels. The normal convention is to write these vowels with a circumflex accent. All other issues with the alphabet are discussed in the relevant lessons.
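Because digraphs count as single letters, splitting an Albanian word into letters is not the same as splitting it into characters. The following Python sketch shows one way to do it. The greedy left-to-right matching is an assumption made for illustration, as are the names `DIGRAPHS`, `SINGLE_LETTERS`, and `split_letters`; characters outside the standard alphabet (such as Geg vowels written with a circumflex) are passed through unchanged.

```python
import unicodedata

# The nine digraphs and 27 single letters of the modern alphabet (36 in all).
DIGRAPHS = ("dh", "gj", "ll", "nj", "rr", "sh", "th", "xh", "zh")
SINGLE_LETTERS = "abcçdeëfghijklmnopqrstuvxyz"


def split_letters(word: str) -> list[str]:
    """Split a word into alphabet letters, treating digraphs as one letter.

    Assumption: a two-character sequence that matches a digraph is always
    read as that digraph (greedy matching, no morphological analysis).
    """
    text = unicodedata.normalize("NFC", word).lower()
    letters = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            letters.append(pair)
            i += 2
        else:
            # Single letters and any other characters pass through as-is.
            letters.append(text[i])
            i += 1
    return letters


print(split_letters("Shqipëria"))  # ['sh', 'q', 'i', 'p', 'ë', 'r', 'i', 'a']
print(split_letters("xhoging"))    # ['xh', 'o', 'g', 'i', 'n', 'g']
```

Under this scheme Shqipëria has eight letters, although it is written with nine characters.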
Standard Albanian, as well as most Tosk dialects, has a seven-vowel system:
|i||similar to the vowel in Eng. meat|
|e||similar to the vowel in Eng. met|
|a||similar to the vowel in Eng. hot|
|o||similar to the vowel in Eng. boat, but not diphthongal. More akin to the vowel in Spanish no.|
|u||similar to the vowel in Eng. boot|
|y||a high, front, rounded vowel; absent in English; similar to the vowel in French tu|
|ë||similar to the final vowel in Eng. sofa|
In Standard Albanian (as well as in most Geg dialects), the vowel ë is typically not pronounced in final position (e.g., nëntë 'nine' is pronounced nënt), except in monosyllabic words (e.g., një 'one', që 'that', etc.). This sound is also commonly elided in other unstressed syllables. In some (mainly Tosk) dialects, this vowel is fully pronounced.
While Standard Albanian has a relatively simple seven-vowel system, most Geg varieties have a much more complex set of vowels. Any of the vowels above, with the exception of ë, can be nasalized. In addition, Geg has distinctive vowel length, so any of the vowels (except, again, ë) can be long or short. Camaj (1984) also claims that some Geg varieties have a distinction between short nasal vowels and long nasal vowels.
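The final-ë rule for Standard Albanian can be sketched in Python as follows. This is a rough approximation under two assumptions: syllables are counted by counting vowel letters, and only word-final ë is handled (the optional elision in other unstressed syllables is not modeled). The name `drop_final_e` is hypothetical.

```python
import unicodedata

VOWELS = set("aeëiouy")  # the seven vowel letters of Standard Albanian


def drop_final_e(word: str) -> str:
    """Return the word as typically pronounced with respect to final ë.

    Assumption: each vowel letter counts as one syllable, so a word with
    a single vowel letter is treated as monosyllabic and keeps its ë.
    """
    text = unicodedata.normalize("NFC", word)
    syllables = sum(1 for ch in text.lower() if ch in VOWELS)
    if text.endswith("ë") and syllables > 1:
        return text[:-1]
    return text


print(drop_final_e("nëntë"))  # nënt
print(drop_final_e("një"))    # një
print(drop_final_e("që"))     # që
```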
As for consonants, though most of the letter-sound correspondences will be familiar, there are some exceptions:
|c||voiceless dental affricate||ts in English cats, z in Italian zio, c in Russian cvet|
|ç||voiceless postalveolar affricate||ch in English choose, c in Italian cento|
|dh||voiced dental fricative||th in English the|
|gj||voiced palatal stop||similar to g in English gear|
|ll||voiced velarized lateral||similar to ll in English ball; in Albanian, unlike in English, this sound can occur in any position in the word.|
|nj||palatal nasal||gn in French agneau, similar to ni in Eng. onion|
|q||voiceless palatal stop||similar to k in Eng. key|
|rr||alveolar trill||rr in Spanish sierra|
|th||voiceless dental fricative||th in English thing|
|x||voiced dental affricate||ds in English needs, z in Italian zero|
|xh||voiced postalveolar affricate||j in English judge, g in Italian giro|
|zh||voiced postalveolar fricative||s in English pleasure, j in French jour|
The Albanian Lessons
- Excerpt from the 2008 Constitution of Kosovo
- Excerpt from Kadare's The General of the Dead Army
- Excerpt from Barleti's biography of Skenderbeg
- Excerpt from the Kanun of Lekë Dukagjinit
- Excerpt from the Missal of Gjon Buzuku