Talk:Formal grammar

Computing Mid‑importance

	This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.ComputingWikipedia:WikiProject ComputingTemplate:WikiProject ComputingComputing articles
Mid	This article has been rated as Mid-importance on the project's importance scale.

Computer science Mid‑importance

This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.Computer scienceWikipedia:WikiProject Computer scienceTemplate:WikiProject Computer scienceComputer science articles

Mid

This article has been rated as Mid-importance on the project's importance scale.

Things you can help WikiProject Computer science with:

Here are some tasks awaiting attention:

Article requests :
- Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup :
- Computer science articles needing attention
- Computer science articles needing expert attention
Copyedit :
- Computing
Expand :
- Computer science
Infobox :
- Computer science articles without infoboxes
Maintain :
- Timeline of computing 2020–present
Photo :
- Find pictures for the biographies of computer scientists (see List of computer scientists)
- Computing articles needing images
Stubs :
- Computer science stubs
Unreferenced :
- WikiProject Computer science/Unreferenced BLPs
Project-related :
- Tag all relevant articles in Category:Computer science and sub-categories with {{WikiProject Computer science}}

Cognitive science (inactive)

This article is within the scope of WikiProject Cognitive science, a project which is currently considered to be inactive.Cognitive scienceWikipedia:WikiProject Cognitive scienceTemplate:WikiProject Cognitive scienceCognitive science articles

Archives

/Archive 1

Terminology clarification requests[edit]

Please feel free to list any potentially bothersome/burdensome/opaque terminology clarification requests here as the article stands, in order that editors not over-clarify the terms. :-)

Current candidates (see archive 1) are:

"recognize"

-- QTJ 06:02, 6 November 2006 (UTC)[reply]

Analytic / Reductive grammars[edit]

The discussion on the need to include information about reductive grammars was growing too large; I have taken the liberty to move it into a page of its own. Rp (talk) 21:29, 4 September 2015 (UTC)[reply]

I searched the Web for reductive grammar and added more on reductive.

A reductive grammar has to do with expressed meaning of language constructs. It is analytical in nature having rules that define language constructs that express contextual meaning. That doesn't conflict with it also being a productive grammar.

It I think has more to do with how non-terminal are defined to have meaningful names that define language elements, structure or phrase. i.e. An arithmetic expression defined as:

expr = term $(('+'|'-') term);

defines expr to be a term followed by zero or more + or - term. Defined as above it expresses operator hierarchy. That may be all there is to it. Using meaningful names in grammar rules. On the other hand actuall semantic transformation may also be involved. What I got is that a reductive grammar breaks a language into meaningful parts.

Anyway mosey on over to the reductive link above.

Steamerandy (talk) 06:54, 19 September 2015 (UTC)[reply]

I added a section explaining the parser programming language of Shorre metacompilers. I am unable to explain them in linguistic terms. In their documentation they are described as reductive or analytical grammars. I can not say why. I have tried. These languages were from the 1960s.

If maybe someone looking at them as used in programming a parser it might help in classifying them. I use the term Abstract Parse Tree to distinguish how it is produced. It is actually an Abstract Syntax Tree produced by the parser using node and tree operators. Although lists are as easy produced. There is a lisp 2 implementation with in CWIC. Although additional base types spicific to compiler writing have been added. Symbol table and node object are two examples.

With the tree building operators the grammar must be analytical in order to define parsed objects and language structures that are transformed into list structures. Not sure about reductive. Originally thought reductive and analytic to be the same. Maybe reductive is the top down component. Defining in finner and finner granulations. Reducing language to smaller and smaller structure.

Compiler grammar specification

Steamerandy (talk) 05:23, 25 November 2015 (UTC)[reply]

context-free grammar[edit]

"A context-free grammar is a grammar in which the left-hand side of each production rule consists of only a single nonterminal symbol."

Is this correct? It seems different from the definition in Context-free_grammar. Took 10:22, 2 June 2007 (UTC)[reply]

No, it's the same definition. (I'm rather confused by your question, actually — it's not that these are two different definitions that can be proved equivalent, but rather that they're actually the same definition, albeit worded differently. Am I missing something?) —Ruakh_TALK 16:37, 2 June 2007 (UTC)[reply]

The problem is that the definition given in the context-free_grammar article was simply wrong (I have never seen it before, and I suspect it produces a formalism more powerful than context-free grammars.) I have taken the liberty to rewrite it there. Rp 14:58, 24 July 2007 (UTC)[reply]

That's really funny. The definition was right when Took wrote his/her comment, then the definition was changed to be wrong, and you're only now seeing his/her comment, which prompted you to take a look and fix the definition. That works really well. :-) —Ruakh_TALK 15:55, 24 July 2007 (UTC)[reply]

I just realize that I misunderstood some word in the above definition. So everything is fine, and glad to see that my comment (unintentionally) helps correct an error :) Took 12:37, 6 August 2007 (UTC)[reply]

Definition of "language"[edit]

Does the language generated by a grammar consist only of the results of applications of finitely many production rules, or is there a notion of convergence to deal with? - Brad 65.93.148.11 05:29, 19 August 2007 (UTC)[reply]

The language generated consists of words obtainable by a finite number of production rule applications. There is no natural notion of convergence that would allow one to consider infinite number of production rule applications in the context of formal grammars. — Carl (CBM · talk) 15:35, 19 August 2007 (UTC)[reply]

Regular Grammar typo[edit]

Just changed the left hand side of rule 6 in the regular grammar example from S to B. With S, the grammar can derive the empty string (which it shouldn't), and every non-empty derived string would contain at least 2 b's (so, e.g., ab could not be derived). So it looks very much like a typo where the S should simply be changed to B.

However, the parenthesis to the right of rule 6 is now inconsistent (because it talks about S, and the rule does not contain S any more). I'm not sure how to resolve that, so I left it unchanged. Anybody who knows exactly how to fix that?

ErikErnst 16:13, 25 September 2007 (UTC)[reply]

After a bit more research there seems to be no evidence supporting the statement made in the parentheses to the right of rule 6, so I deleted it.

ErikErnst 16:43, 25 September 2007 (UTC)[reply]

S and alphabet[edit]

From the current example: This process is repeated until we only have symbols from the alphabet (i.e., $a$ and $b$ ).

S itself must consist of elements of the given alphabet only, right? As we said the alphabet consists of a and b only, the process of transforming S (with which we start in the example) is already finished when we start. --Abdull 11:52, 4 December 2007 (UTC)[reply]

In this example S is just a symbol, but not a symbol from the alphabet. It's a symbol that is used to recursively generate strings. The actual strings that are generated are the ones that don't contain S; the ones that do contain S aren't done yet. — Carl (CBM · talk) 14:08, 4 December 2007 (UTC)[reply]

Renaming[edit]

This page is useful, but very messy. The main reason, I feel, of its messiness was the continuous attempt to appease both computer scientists and linguists; I have taken the liberty to take what I I felt was the most urgent step to rectify this, namely, renaming it to use well-established terminology. The old name (formal grammar) isn't used by anyone I know. Rp (talk)

I have performed this deletion and move, as requested. I don't do this often, so if there's something that I should have done, or should have done differently, I'd appreciate knowing about it. Accounting4Taste:talk 16:37, 30 April 2008 (UTC)[reply]

No, you undid the move I made. Can I please redo it?? Rp (talk) 16:16, 2 June 2008 (UTC)[reply]

Yeah can some admin somewhere move this back to Grammar (formal language theory) -- the current name, formal grammar, is just kind of bizarre ... linas (talk) 02:43, 1 September 2008 (UTC)[reply]

Added text[edit]

I've just deleted the following text from the article:

===Formal language===

A formal language is an organized set of symbols the essential feature of which is that it can be precisely defined in terms of just the shapes and locations of those symbols. Such a language can be defined, then, without any reference to any meanings of any of its expressions; it can exist before any formal interpretation is assigned to it -- that is, before it has any meaning. First order logic is expressed in some formal language. A formal grammar determines which symbols and sets of symbols are formulas in a formal language.

=== Formal systems ===

A formal system (also called a logical calculus, or a logical system) consists of a formal language together with a deductive apparatus (also called a deductive system). The deductive apparatus may consist of a set of transformation rules (also called inference rules) or a set of axioms, or have both. A formal system is used to derive one expression from one or more other expressions.

=== Formal proofs ===

A formal proof is a sequence of well-formed formulas of a formal language, the last one of which is a theorem of a formal system. The theorem is a syntactic consequence of all the wffs preceding it in the proof. For a wff to qualify as part of a proof, it must be the result of applying a rule of the deductive apparatus of some formal system to the previous wffs in the proof sequence.

=== Formal interpretations ===

An interpretation of a formal system is the assignment of meanings to the symbols, and truth-values to the sentences of a formal system. The study of formal interpretations is called formal semantics. Giving an interpretation is synonymous with constructing a model.

I removed all this because, well, the definition of formal language is wrong, and formal interpretations have darned little to do with grammar; they're about first-order logic, and lambda calculus, and stuff like that. Proofs are about theorem proving -- not grammar .. sure its related because semi-Thue systems and etc. but doesn't seem to belong in this article, which isn't about theorem proving ... etc. linas (talk) 02:27, 1 September 2008 (UTC)[reply]

I agree about the interpretation stuff. That is a bit down the road as far a formal grammar. However, I strongly object to removing the background information on languages, systems, and proofs. I do not understand your belief that these have little to do with formal grammar. Just what exactly are you doing with formal grammar that doesn't involve formal language, systems, and proofs? Pontiff Greg Bard (talk) 20:33, 6 March 2009 (UTC)[reply]

A central area of application for formal grammars is in compiler theory. There is no "formal system" or "formal proof" for the language in that area; the languages consist of computer programs, and it would not make sense to "prove" one computer program using other computer programs as "axioms". Another area of application is in the description of the subrecursive hierarchy, for example to define regular languages. There again, there is generally not any formal proof system associated with a regular language.

On the other hand, in mathematical logic, we study formal proof, but not formal grammars. — Carl (CBM · talk) 14:55, 10 March 2009 (UTC)[reply]

We are obviously having some intradisciplinary issues[edit]

In cases where a term describes a single concept which is studied formally in some fields, and has applications in others, etcetera; it is a good practice to include the various fields; aspects rather than split out into separate articles and delete material from one field or the other.

The "formal language theorists," who ever they are, are deleting the mathematical logic material rather than integrate it. These aren't two different concepts. The things you are deleting are not mutually exclusive to what you are adding. Please integrate your wonderful perspective, without taking over the article with name changes, and deletion of lead paragraph material.

Furthermore, there should at least be an article or section on "formal language theory" rather than a redirect, otherwise it is not a very useful opening link. I support the idea that any "field" to which the topic is associated in the lead paragraph should at least have some article content for the reader.

Be well, Pontiff Greg Bard (talk) 20:24, 6 March 2009 (UTC)[reply]

This article is confusing two closely related, but different notions, namely that of a formally defined grammar and that of a grammar as that term is used in formal language theory. The former is a rather informal notion and applies to fields such as formal logic; the latter is a specific way to formalize the former chosen with the purpose of studying a particular range of problems relevant to some applications (but not e.g. to formal logic). In the formal language article, the confusion is the same but worse.

It is my firm conviction that the informal and formal notions should be discussed separately, possibly in separate articles. Rp (talk) 00:27, 10 March 2009 (UTC)[reply]

So formal grammar is informal in mathematical logic. Say listen Rp, whatever you say. I think one article should be enough. The distinction you cite still isn't very clear at all. What is clear that there are some people who see it one way and others who see it a different way. Its a big back and forth and that isn't the greatest way to do things.

One thing we can do together is for us each to not insert our own view and delete the other's.

I am still dubious on "formal language theory" and "formal language theorists." If I could contact one, that would help. If there was an article describing such a thing, that would also help.

Start here (or with any other textbook on formal language theory). Rp (talk) 07:35, 11 March 2009 (UTC)[reply]

And stop changing basic definitions into flagrant nonsense. This is not a matter of point of view but of subject matter knowledge. Rp (talk) 22:51, 11 March 2009 (UTC)[reply]

There does exist a "mathematical logic" article, and that is where I recommend we point people who want to inquire, until there is more information elsewhere. Be well, Pontiff Greg Bard (talk) 02:15, 10 March 2009 (UTC)[reply]

Be well, Pontiff Greg Bard (talk) 02:15, 10 March 2009 (UTC)[reply]

Who want to inquire what? Sorry, it is not clear to me what you mean. Rp (talk) 07:36, 11 March 2009 (UTC)[reply]

The term "formal grammar" is use extremely rarely in mathematical logic. It is a term from formal language theory and computer science. Mathematical logicians do not, by and large, study arbitrary formal languages; they study a very small number of formal languages related to predicate logic. When presenting this small number of languages, mathematical textbooks extremely rarely (if ever) couch it in the language of formal grammar. Instead they simply present, ad hoc, the recursive definitions that are required. Even Enderton's textbook, which spends a large amount of time on the effective recognizability of first-order formulas, does not use terminology from formal grammar theory.

It is not normally used in formal language theory, either: there, the term is just "grammar". Rp (talk) 07:35, 11 March 2009 (UTC)[reply]

Responding to Rp, could you explain what you mean by a "formally defined grammar" apart from a formal grammar? I don't follow you. — Carl (CBM · talk) 14:40, 10 March 2009 (UTC)[reply]

What I mean is that in formal language theory, grammars are not used to define formalisms, they are themselves a formalism , the subject of study. Rp (talk) 07:35, 11 March 2009 (UTC)[reply]

I think you may need to qualify your use of the word formalism a little. Anything you construct in a formal language can be called a "formalism." To say that a formal grammar does not define a "formalism" doesn't make any sense. (I changed it to intra btw)Pontiff Greg Bard (talk) 08:05, 11 March 2009 (UTC)[reply]

You are not reading carefully enough, and you're introducing nonsense into the article as a result. I did not write that a formal grammar does not define a "formalism". By a "formalism" I mean a formalization used to describe something and to reason about it. What I meant to say is that in formal language theory we don't look at what languages describe, only at their syntax and mechanisms by which syntax can be described. Please familiarize yourself with the basic notions of formal language theory before proceeding, or this article, which is elementary to computer science, will always remain a mess. Please! Rp (talk) 22:51, 11 March 2009 (UTC)[reply]

Re Rp: I didn't mean to emphasize the word "formal". I was saying that the concept of a grammar itself (for example in EBNF form) is not studied nor used by mathematical logicians (apart from extremely rare cases, if any). I don't mind whether they are called "formal grammars", "generative grammars", or just "grammars"; in any case they are not found in mathematical logic textbooks. They are of interest primarily to computer scientists and linguists. — Carl (CBM · talk) 11:39, 11 March 2009 (UTC)[reply]

That is exactly my point. It would help if Gregbard could give some examples of the use of the term "formal grammar" or "grammar" or the study of such grammars from the area of mathematical logic. Mind you, I am well aware that grammar formalisms are used by logicians, but I have only seen applications, not studies on the formalisms themselves. Rp (talk) 22:51, 11 March 2009 (UTC)[reply]

A formal grammar is not a set of strings[edit]

The article currently says,

"In mathematical logic and theoretical computer science, a formal grammar (sometimes simply called a grammar) is set of strings composed from the alphabet of a formal language. "

This is the definition of a formal language, but not a formal grammar. A formal grammar is a certain type of description for how one could form a formal language. Moreover, as several people have explained above, formal grammars are not studied in mathematical logic, they are studied in computer science. Gregbard, rather than reverting you, I want to give you a chance to correct these. I'll fix them later if they're still broken. — Carl (CBM · talk) 11:27, 11 March 2009 (UTC)[reply]

What is said here: http://209.85.229.132/search?q=cache:http://planetmath.org/encyclopedia/GenerableByAFormalGrammar.html is clearer than the article. Is it correct?--Philogo (talk) 23:51, 11 March 2009 (UTC)[reply]

That page says, "A grammar, loosely speaking, is a set of rules that can be applied to words to generate sentences in a language." That is correct. The article here presently says, "In formal language theory, grammars, also called formal grammars or generative grammars, are a formalism used to describe formal languages – i.e. sets of strings over some alphabet." That is also correct. The issue before was that the article here was identifying formal grammars with formal languages, when the two concepts are very different.

I wanted to browse PlanetMath farther, but it seems to be down. — Carl (CBM · talk) 01:51, 12 March 2009 (UTC)[reply]

I liked the planetmath formulation too philogo. Pontiff Greg Bard (talk) 04:24, 12 March 2009 (UTC)[reply]

I have fixed the problem. I overgeneralized. However. it certainly not true that it is not studied under mathematical logic. It may not be a major focus, but you can't really say it is not under mathematical logic at all. Please take a look at the category organization of things. If there is some problem, I suggest we start there. It's was under predicate logic which was too specific because propositional logic has a grammar too.

Much appreciated! The introduction looks quite OK to me now. I still resent the implication that formal grammars are a notion from formal logic, even though it is closely connected to it. Rp (talk) 19:25, 12 March 2009 (UTC)[reply]

I wish people would not take the attitude of its this field NOT THAT FIELD. If is continues then we really are going to have to start making all kinds of articles (logic) for articles that already exist. The logic template should portray a coherent story, with common terminology. That is not too much to ask. I don't think we should have string (computer science) and string (logic) etc. However if this is the way it always is going to be them we will have to do that for MANY of the terms in there. Ridiculous.

Good point ... Syntax (logic) is a case in point, unless it can be given more logic-specific substance. Rp (talk) 19:25, 12 March 2009 (UTC)[reply]

Carl, always feel free to correct me. I have complete respect for you, you know. Be well, Pontiff Greg Bard (talk) 03:16, 12 March 2009 (UTC)[reply]

I do feel that the study of formal grammars is not, in any significant way, a part of mathematical logic. If the categories here claim it is, they need to be fixed. The reason I can say this is that I have looked at many mathematical logic textbooks over the last few months and I have never seen one that discusses formal grammars. If there is one, I would be happy to be corrected. The situation is even worse for grammars than for formal languages – at least a few mathematical logic books mention the concept of an arbitrary formal language, even though the study of general formal languages is not part of mathematical logic.

In the same way, the study of classes of languages such as regular languages and context-free languages is not part of mathematical logic. These topics are covered by computer science textbooks, but not by mathematical logic textbooks. In general, formal language theory is a computer science topic.

As for string (logic), if the meaning is the same as in computer science, then a simple redirect to string (computer science) would be reasonable. But this is not the same situation as with grammars, which are really only studied in computer science and linguistics, not in "logic" nor in mathematical logic. — Carl (CBM · talk) 03:54, 12 March 2009 (UTC)[reply]

It seems to me that if you want to make a formal system, a propositional or predicate calculus you will have to define a formal grammar. All of those articles are listed under mathematical logic.Pontiff Greg Bard (talk) 04:23, 12 March 2009 (UTC)[reply]

Can you point at any mathematical logic textbook that defines a formal grammar (for example in EBNF form) in order to define the syntax of first-order logic? Usually the definition is just given in an ad hoc way, explicitly stating the recursive definitions needed, but is not couched as a formal grammar, nor as a rewriting system. I don't mean a book that happens to use the word "grammar" once in passing, like Franzen's book; I mean something that actually uses a generative grammar in a form like EBNF. — Carl (CBM · talk) 04:39, 12 March 2009 (UTC)[reply]

No I don't usually see them use the term, however that's what it's called when they spell out that if ɸ is a formula then ~ɸ is a formula, if ɸ and Ψ are formulas then (ɸ & Ψ) is a formula, etcetera. I see that type of thing spelled out commonly whenever they designate a propositional calculus. They are usually called formation rules. That term is directed here and it was incorporated into the article, and was subsequently deleted. Pontiff Greg Bard (talk) 05:33, 12 March 2009 (UTC)[reply]

While formation rules for propositional and first order logic are a special case of formal grammars, mathematical logic does not include the study of sets of formation rules in general (nor the study of formal grammars in general). A very few sets of these rules are used to define a few systems such as first order logic but, as I said above, the definitions are usually presented in an ad hoc way.

In order for formal grammars to be part of mathematical logic, mathematical logicians would need to actually study formal grammars as an independent topic of research. This is done in computer science, but not mathematical logic. I think your argument is akin to saying that abstract algebra includes mathematical logic because the principle of mathematical induction is sometimes used to prove results in abstract algebra. — Carl (CBM · talk) 05:47, 12 March 2009 (UTC)[reply]

Moreover, the syntactic rules of formal languages such as defined in mathematical language usually (implicitly) define trees of terms, rather than strings. E.g. you often see a definition of formation rules with a casual remark in the text that brackets may be used to disambiguate expressions where necessary. This is out of the question in formal language theory. So while there is a close relationship the notion of syntax definitions in e.g. logic is not exactly the same as that of a grammar in formal language theory. Rp (talk) 19:25, 12 March 2009 (UTC)[reply]

I don't really agree with the all or nothing view. Formation rules are a term within mathematical logic even if mathematical logicians don't slave over hot formation rules all day. It is a type of formal grammar. Furthermore I am trying to make sure all the terms from the logic template are consistent, and complete. I have split off the article to make a tiny one. I think we would be better served constructing a comprehensive article that covered all the aspects so as to make it possible to be an FA. In the absence of a community desire to do that at least lets make sure the use of these things in logic is covered. Be well, Pontiff Greg Bard (talk) 22:11, 12 March 2009 (UTC)[reply]

Wrong. Let's discuss this, before you further edit the articles. "Formal grammar" is used in mathematical logic to refer to the formation rules. Perhaps non-mathematical metalogic makes the distinction. You seem to be creating a term not used in the literature. In any case, the split article should not be called "formation rules". Perhaps there are more distinctions between formal grammar (computer science), formal grammar (linguistics), and formal grammar (mathematical logic), but that isn't it. — Arthur Rubin (talk) 19:28, 13 March 2009 (UTC)[reply]

Clarity[edit]

Why cannot this article be as clear as http://209.85.229.132/search?q=cache:http://planetmath.org/encyclopedia/GenerableByAFormalGrammar.html which says as below. The explantion is clear, short, involves no technical words unfamiliar to the normal reader, (accept perhaps "mathematical abstraction" and makes the matter simple, not more complicated than it is. It would be nicer still if it defined "language" in passing. I suspect Einstein developed the skill of writing clearly about complex matters from his experience as a patent-writer.

A grammar, loosely speaking, is a set of rules that can be applied to words to generate sentences in a language. For example, with the grammar of the English language, one can form syntactically correct sentences such as “The elephant drove his bicycle to the moon,” regardless whether the sentence is meaningful or not.
The mathematical abstraction of a grammar is known as a formal grammar. Instead of generating sentences from words, a formal grammar generates words from symbols and other words. The following basic ingredients are necessary in a formal grammar:
a collection of symbols called an alphabet, a collection of rules, called rewriting rules, specifying how one can generate new words from existing ones, and a collection of initial words that serve to initialize the generation of new words.

Formation rules[edit]

Let's not split the articles until we decide on the name. In my not-so-humble opinion, formation rules is not used in mathematical logic, even if it were the "correct" term. — Arthur Rubin (talk) 19:30, 13 March 2009 (UTC)[reply]

There are two issues here:

1. The study of general formal grammars is not part of mathematical logic, in the sense that research in this area is carried out with different goals and by different people than research in mathematical logic

2. Nevertheless, formal grammars are used implicitly in defining things such as the set of formulas in first-order logic

I tend to agree that there's no need to create Formation rules merely because of (1). It is perfectly possible to say that mathematical logic uses some special instances of formal grammars, without claiming that formal grammar theory is part of mathematical logic. Certainly mathematical logic uses basic facts of number theory, but number theory is also not part of mathematical logic. — Carl (CBM · talk) 20:44, 13 March 2009 (UTC)[reply]

That is a perspective that respects the academics just fine and has no respect for the reader at all. I don't care where the material in "formation rules" ends up, so long as the topic is covered for people trying to learn. The article on first order logic features a section of that name, and therefore it is fair game to be an article of its own under mathematical logic. The logic template has included the term formal grammar, and no one has complained about it until now. Pontiff Greg Bard (talk) 21:05, 13 March 2009 (UTC)[reply]

Could you explain the argument that, because the article first-order logic has a section on formation rules, there has to be an article that claims that the general study of formation rules is part of mathematical logic? — Carl (CBM · talk) 21:12, 13 March 2009 (UTC)[reply]

No one ever claimed that the "general study of formation rules" is a part mathematical logic. All I am claiming is that it is a term to be understood when learning mathematical logic, there is plenty of evidence that that is the case whether you call it a formal grammar or a formation rule. Really this boils down to you guys not caring if people can learn from the articles, as much as it agrees with your perspective. Who cares about a "general study of formation rules/formal grammar?" That's just black and white thinking.

Let me tell you a little something about analysis. You break things down into their components. The formation rules/formal grammar sure are a part of any formal system, and so this whole argument revolves around your image of your field, rather than helping people understand. Terrible. Find a good place for that stuff. Be well, Pontiff Greg Bard (talk) 21:24, 13 March 2009 (UTC)[reply]

When I look at the beginning of this edit I do see the claim that the general study is part of mathematical logic. I don't think anyone here is arguing that formation rules never appear in mathematical logic. However, I do think few students learn about formation rules in general (for example, the Chomsky hierarchy) when learning about first-order logic, any more than they learn about semigroups in general when learning that the word problem for semigroups is undecidable. — Carl (CBM · talk) 22:25, 13 March 2009 (UTC)[reply]

I don't see why formal grammar is not the place for all of this article. It's not part of mathematical logic, linguistics, or computer science, but is used by all three (the middle one to a lesser degree). It may be formalized in recursion theory, but that formalization is not necessary for analysis. If there is consensus that formal grammar (mathematical logic), formal grammar (linguistics), and formal grammar (computer science) should be split, that would be different. I think CBM may have hit the ~~pontiff~~ nail on the head; the lead needs to be fixed to remove any implication that it is part of mathematical logic. — Arthur Rubin (talk) 22:55, 13 March 2009 (UTC)[reply]

...and yet there is a section for it in the first order logic article (which I did not put there).

You (Carl) certainly are reading in the interpretation since no where do I say (or care) what is a "general study" of mathematics. This is a term you guys are using to make it seem as if there needs to be a special team of experts working on a very specific thing in order to consider it "within" mathematical logic. This appears to me to be quite neurotic. I've seen this topic covered under mathematical logic. Either incorporate that aspect in the article on formal grammar, or split it out into a separate article under mathematical logic --period.

I really just get the sense that neither of you cares very much for helping people understand the subject matter or consistent story in the logic template at all. Terrible. Pontiff Greg Bard (talk) 23:02, 13 March 2009 (UTC)[reply]

Please remove formal grammar from the logic template; it clearly doesn't belong there in any form. It's used in mathematical logic, but I don't see it is central. Your attempts to "clarify" the issue make it incomprehensable to anyone who does use it mathematical logic, as did your metalogic and the redefinition of standard terms you used before that, and before that. Please refrain from creating new terms or redefining existing ones unless you can provide sources that your new defintions are actually used, and would be the primary use. — Arthur Rubin (talk) 23:20, 13 March 2009 (UTC)[reply]

I think there is actually one good reason to keep the two articles separate. While logic works traditionally with strings, there are generalisations that do not. $L_{\infty \omega }$ comes to mind as a relatively important one. I think in-depth discussion of formulas as trees rather than strings, with possibly infinite branching, should not happen in the present article. But it needs to be done somewhere, and Formation rules looks like a good place. That said, I consider it a supplementary article in the sense that everything it says should mainly be covered here, and in articles on the various logics, and formation rules should merely present the same material with slightly different accents. I think there should be a short treatment of formation rules in logic within the present article, briefly explaining some of the background that is usually implicitly understood in logic articles, and a more complete treatment assuming the usual background at formation rules.

An alternative could be an article on the tree generalisation of formal languages, and more specifically about formal grammars that define such tree languages. But such an article, while certainly interesting, would naturally be dominated by XML (see XML#Validity), and the infinitary aspect of applications in logic would still make them stand out. --Hans Adler (talk) 15:42, 22 March 2009 (UTC)[reply]

Interpetations[edit]

Mine: I call: "0,1,2,3" are here regular, context-free, "l-system", "touring" respectively.

While sentence is made from tokens the "0,1,2 languages" add or leave it as it is (terms into that sentence). Turing can also change parts of it.

Therefore there are 2 categories : may add , may add or change.

In this list I presume that the actions like "adding" or "insert" might not be done at all. And "more places" might mean just 1 or zero. I just don't say it in the list to make it shorter :).

0 language adds to sentence.
1 language inserts to sentence.
2 language inserts at more places. He inserts it in parallel, therefore rules should be mutually exclusive.

It is important to know that 0,1 languages make change only at one place at a time. They can be (even 0 when there is some foretelling) made to run in parallel by machine, but it might not work. 2 language have to work, it is given by what it is (just like 2 + 2 is 4, for it is given. It might have been 8, but it would make sense ... I hope You get mine point :) ).

Now, something I just ask, because:

0 language changes only 1 symbol. A -> aB //// A-> Aa is not possible in 0 language because it is not expanded at the end? But it can be understood as A->aA in this case of just one rule (like simple recursion)?
1,2,3 languages can change sequence of symbols? AA -> aBACd
3 can change symbols and terms? aA -> Ab

Correct me please. 84.16.123.194 (talk) 12:42, 10 May 2009 (UTC)[reply]

In the Chomsky hierarchy, a type 2 language is a context sensitive language, not a language generated by L-systems. The answer to your first two questions is no. For the answer to the first, see the discussion in regular grammar; to the second, see the definition of context-free grammars, which says there must be exactly one nonterminal on the left hand side of a rule; to the third, yes you're correct. If this is not clear from the article, it should be. How to improve it? Rp (talk) 16:15, 16 June 2009 (UTC)[reply]

Relation[edit]

Is the relation defined in

Given a grammar $G=(N,\Sigma ,P,S)$ , the binary relation $\Rightarrow _{G}$ (pronounced as "G derives in one step") on strings in $(\Sigma \cup N)^{*}$ is defined by:

$x\Rightarrow _{G}y{\mbox{ iff }}\exists u,v,p,q\in (\Sigma \cup N)^{*}:x=upv\wedge y=uqv\wedge p\rightarrow q\in P$

well founded? Money is tight (talk) 13:09, 4 February 2010 (UTC)[reply]

Syntax == grammar?[edit]

From this article: "[A grammar's] rules describe how to form strings from the language's alphabet that are valid according to the language's syntax."
Then there is a section called Formal definition - The syntax of grammars, but this section does not explain the relation between syntax and grammar.

Therefore, are syntax and grammar the same? Is there a mathematical way to give a relation between these two entities? --Abdull (talk) 14:42, 31 August 2010 (UTC)[reply]

Computer scientists talk about the syntax of a programming language; mathematicians and logicians talk about the grammar of a formal language. The two concepts are more-or-less interchangeable, although there may be some technical differences. Gandalf61 (talk) 15:05, 31 August 2010 (UTC)[reply]

Generally in computer science syntax is the language from the word level up. lexical is the forming of characters into words. But it sometime is equivalent to grammar. Steamerandy (talk) 05:07, 28 January 2015 (UTC)[reply]

Null string on left of the rule?[edit]

The left side of a rule can not be the null string. For one thing, $\Lambda \notin (\Sigma \cup N)^{*}N(\Sigma \cup N)^{*}.$ — Arthur Rubin (talk) 04:45, 8 October 2011 (UTC)[reply]

You can simulate it by adding a new non-terminal α, interspersing αs before and after every character in the string on the right side, and changing the rules, as follows:

$S\rightarrow T\Rightarrow S\rightarrow T'$
$\Lambda \rightarrow T\Rightarrow \alpha \rightarrow T'$

and adding a new rule

$\alpha \rightarrow \Lambda$

where, for example

(abc)'=\alpha a\alpha b\alpha c\alpha

— Arthur Rubin (talk) 04:58, 8 October 2011 (UTC)[reply]

Unrestricted grammar[edit]

It appears that the languages recognized by an unrestricted grammar are the same as those recognized by this description of a formal grammar. The proof can be done as follows. Add a new nonterminal symbol a′ for each terminal a, and a new nonterminal λ. For any string T, define T′ to be obtained from S by replacing each terminal b by b′, and interspersing λ before and after every character. For example, (if we denote nonterminals by greek letters and terminals by roman letters), if

T=a\alpha bc\beta ,

then

T'=\lambda a'\lambda \alpha \lambda b'\lambda c'\lambda \beta \lambda .

If we then replace each rule

T\rightarrow U

in the unrestricted grammar by

T'\rightarrow U'

and add new rules:

S\rightarrow \lambda S\lambda

\lambda \rightarrow \Lambda

(the empty string), and

a'\rightarrow a

for each terminal a;

then the modified grammar is a formal grammar in this sense, but the language recognized is the same.

If we can find a source for this, where does it go? Do we merge formal grammar and unrestricted grammar? — Arthur Rubin (talk) 13:52, 11 October 2011 (UTC)[reply]

The proof is correct, I have never seen this definition of a formal grammar, as mentioned above the name is not used in computer science. But it looks a bit like the context-sensitive grammar rules, without the non-empty requirement and without the context in the right-hand side. So where does this definition comes from? It does not come from Chomskies first paper, ther he talks about rewriting a string into another string, without restrictions.--134.102.206.230 (talk) 12:06, 30 November 2011 (UTC)[reply]

They are the same thing. Merely the fact that two articles were created shows that something went completely wrong. They should be merged. Rp (talk) 12:42, 29 January 2015 (UTC)[reply]

What is more: the title, formal grammar, is also wrong, they are rarely called that. Shall we rename this article to grammar (formal language theory)? That would convey what the title tries to say using a convention that is widely used on Wikipedia. Rp (talk) 09:10, 28 August 2015 (UTC)[reply]

Context-sensitive is missing[edit]

The Typ 1 Chomsky Grammars are not mentioned in the article.--134.102.206.230 (talk) 11:58, 30 November 2011 (UTC)[reply]

Consecutive sentences appear to contradict each other[edit]

The following two sentences appear consecutively:

"Any particular sequence of production rules on the start symbol yields a distinct string in the language. If there are essentially different ways of generating the same single string, the grammar is said to be ambiguous."

But the second sentence seems to say that in some grammars there are essentially different ways of generating the same string, while the first sentence seems to say that that cannot happen (because of the use of the word "distinct").

I hope someone knowledgeable about this subject will resolve this apparent contradiction.50.234.60.130 (talk) 14:24, 23 December 2020 (UTC)[reply]

Well spotted. That first sentence was, unfortunately, completely wrong:

A sequence of production rules doesn't yield a string at all:
you also need to state where to apply those rules;

it only yields a string in the language if no nonterminals are left at the end.
Not every sequence of rules can be turned into a valid sequence of rule applications on a given string: the left hand side of a rule to apply must occur in the string at that point in order for that rule to be applicable.
The statement that different sequences of rule applications yield different strings only makes sense if you restrict your range of grammars and the way you apply rules, e.g. if you assume context-free grammars and always rewrite the leftmost left hand side found in the intermediate string at hand.

I've taken the liberty to strike that sentence and the reference to Turing machines (which didn't make any sense to me either), reword a few things for clarity, and add another example to show non-contextfreeness and ambiguity. Rp (talk) 20:53, 5 January 2021 (UTC)[reply]

Error in Example 3[edit]

The grammar claimed to be equivalent to example 2's, as far as I can tell, is not. Counterexample: "b" can be derived from Grammar 2 as described in the article. It can not be derived from Grammar 3. Rule 2 as the last-applied rule is out as it would produce an undesired S at the end. So we have to use aSa -> b, which can not work as we can only ever have one S at a time, and have no way of producing an a on it's right hand side. A fix would be to substitute rules 3 and 4 with S -> a and S -> b. 45.143.78.99 (talk) 12:01, 20 January 2021 (UTC)[reply]

Done, thanks! I don't know how those extra "a"s ended up in there ... Rp (talk) 22:17, 21 January 2021 (UTC)[reply]

Mention the topic of equivalence of two grammars[edit]

There's the concept of equivalence of two grammars, and the respective article (Equivalence (formal languages)) really should be linked somewhere in here. I wrote the comment in Formal language. Shuween (talk) 17:19, 7 March 2023 (UTC)[reply]

Not all formal grammars are generative[edit]

In the case of parsing expression grammars, it is provably undecidable to determine whether the language of a grammar is nonempty, hence it is grossly incorrect to claim (as does the current version of this article) that

A formal grammar describes how to form strings from an alphabet of a formal language that are valid according to the language's syntax.

A parsing expression grammar allows you to test (in linear time, even) whether a string belongs to the corresponding formal language, but this is very different from saying how to form such strings.

One way to effect the proof of undecidability is that any instance of the Post correspondence problem reduces to a linear size instance of the problem of finding an element of the language of a particular parsing expression grammar. This is in Ford's POPL'04 paper, and is (as he remarks) essentially the same proof as is used to prove it undecidable whether the intersection of two context-free languages is nonempty. 130.243.94.123 (talk) 13:14, 15 January 2024 (UTC)[reply]

The opening as it currently is stated is in line with how it is described in every reliable secondary source I have been able to find. Do you have a source that describes it differently? Epachamo (talk) 18:26, 31 January 2024 (UTC)[reply]

Sketch for new opening[edit]

A formal grammar is a rigorous finitary definition of a formal language, where the language is understood as an in general infinite set of strings over an alphabet. A grammar does not describe the meaning of the strings or what can be done with them in whatever context—only their form.

To be regarded as a grammar, such a definition is usually expected to take the form of rules for how certain units within the strings of the language are made up of smaller units also defined by rules, but it is hard to make this a strict requirement, as some of the more powerful classes of grammar may when applied end up doing arbitrarily complex computations. There is a tension between a grammar being powerful enough to describe a language of interest, and the grammar being simple enough that it permits drawing conclusions about the language it describes. Accordingly, there is in the literature a number of different classes of grammar defined and studied.

Formal language theory, the discipline that studies formal grammars and languages, is a branch of applied mathematics. Its applications are found in theoretical computer science, theoretical linguistics, formal semantics, mathematical logic, and other areas.

A grammar is generative if it supports generating arbitrary strings in its language. A grammar is analytic if it supports testing arbitrary strings for membership in the language. Several important classes of grammar (for example context-free grammars and regular grammars) are both generative and analytic, but this need not be obvious from the definition. Other classes might be only one of these things, in the sense that instances of an undecidable problem reduce to the problem of using a specific grammar the other way.^{[Note 1]}

37.197.59.228 (talk) 14:16, 23 January 2024 (UTC)[reply]

Why does this distinction need to be made? Who needs it? Rp (talk) 11:02, 30 January 2024 (UTC)[reply]

Note the purpose of the introduction from MOS:LEAD (emphasis mine):

"In Wikipedia, the lead section is an introduction to an article and a summary of its most important contents... It gives the basics in a nutshell and cultivates interest in reading on—...It should be written in a clear, accessible style with a neutral point of view.

While it is true that parsing expression grammars are not necessarily generative, generative grammars are still what is most commonly talked about when we say "formal grammar" and they serve as the foundational motivating example (for example, in this paper "grammar" implies "generative grammar." https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3367686/ ). When readers are looking for formal grammars, therefore, they are most likely to be looking for generative grammars. To deemphasize generative grammars in the introduction sacrifices readability and usefulness for a lay reader in favor of technical accuracy. I don't think this tradeoff is appropriate here, since the article already talks about non-generative grammars while still giving generative ones the emphasis they deserve. The suggested revision would make the introduction much less clear and accessible.

That being said, I would support extending the last paragraph to add more information about how parsing grammars are not necessarily generative. Collisteru (talk) 03:35, 31 January 2024 (UTC)[reply]

My point is that currently the article is not neutral point of view, but rather biased for a view that formal grammars have to be generative. (There is that old archived discussion where others have tried to argue a similar point, but seem to have lacked clear examples.) The early theory of formal grammars was based on the generative formalism, so it is not surprising to see textbooks defining formal grammar as precisely that if they don't go further, but the literature has long since moved beyond that; a second example class of analytic but not generative grammars are range concatenation grammars (which are even listed in the "canon" of the Template:Formal languages and grammars).

I would (in the article as a whole) certainly still present the rewriting-based generative formalism before getting into non-generative formalisms — even introduce it less technically, since the idea of working with preliminary strings that contain a mix of actual text (terminal symbols) and placeholders for text still to be inserted (nonterminal symbols) is quite intuitive — but I would also make it clear from the start that this is one formalism, not the only formalism. 37.197.59.228 (talk) 07:20, 31 January 2024 (UTC)[reply]

This recommended opening is less accessible than what we currently have. If it were more accessible then I would vote to change it. Epachamo (talk) 18:20, 31 January 2024 (UTC)[reply]

Agreed, I would also vote for an introduction that emphasizes that generative grammars are only one formalism if it was just as accessible as the one we have now. Collisteru (talk) 19:45, 31 January 2024 (UTC)[reply]

Cite error: There are <ref group=Note> tags on this page, but the references will not show without a {{reflist|group=Note}} template (see the help page).

[Note 1]