User:Trilobite/MEP script


How about this as a starting point. I can add in things like a link to an official website if there is one. The names in the categories are also best put in as they appear and then I can change them manually so that they follow the "surname, firstname" format, because of the way naming conventions vary. There will need to be a male/female flag of some kind to determine whether the output of the script should have "he" or "she", but you've probably got this sorted already. Of course, it could just as easily be done manually. I think the delegations don't deserve wikilinks, but the committees do. I don't know how much rewriting of the script there needs to be to get this output, but if there's something that can't be done I'm sure we can work around it somehow.

Here's a sample output for countries divided into EP regions:

'''NAME''' (born [[DATE]] [[YEAR]] in [[PLACE]]) is a [[COUNTRY|NATIONALITY]] politician and [[Member of the European Parliament]] for the REGION of COUNTRY. He is a member of the [[PARTY]], part of the [[EUROPARTY]].

He sits on the [[COMMITTEE1]] and is a substitute for the [[COMMITTEE2]]. He is also a member of the DELEGATION1 and a substitute for DELEGATION2.

<you could slot in the career paragraph here maybe, as it's easily turned into proper sentences>

==External links==
* [http://wwwdb.europarl.eu.int/ep6/owa/whos_mep.data?ilg=EN&iucd=##### European Parliament biography]
* [http://wwwdb.europarl.eu.int/ep6-dif/#####_##-##-####.PDF Declaration of financial interests] ([[Portable Document Format|PDF]])

{{MEP-stub}}
[[Category:YEAR births|NAME]][[Category:Members of the European Parliament from COUNTRY|NAME]]

Here's one for countries that are all one constituency:

'''NAME''' (born [[DATE]] [[YEAR]] in [[PLACE]]) is a [[COUNTRY|NATIONALITY]] politician and [[Member of the European Parliament]] for the PARTY, part of the EUROPARTY.

He sits on the [[COMMITTEE1]] and is a substitute for the [[COMMITTEE2]]. He is also a member of the DELEGATION1 and a substitute for DELEGATION2.

<you could slot in the career paragraph here maybe, as it's easily turned into proper sentences>

==External links==
* [http://wwwdb.europarl.eu.int/ep6/owa/whos_mep.data?ilg=EN&iucd=##### European Parliament biography]
* [http://wwwdb.europarl.eu.int/ep6-dif/#####_##-##-####.PDF Declaration of financial interests] ([[Portable Document Format|PDF]])

{{MEP-stub}}
[[Category:YEAR births|NAME]][[Category:Members of the European Parliament from COUNTRY|NAME]]

I've done an example of the first kind at Natalie Griesbeck.

The example of Natalie Griesbeck is a very nice one. Yes I think it should be done this way.
What I currently don't know is where to get the constituency information for an MEP, so if you have a source from which I can retrieve it automatically in some way, that would be very welcome. Turning the biographical information from the EP web site into nice sentences like "Griesbeck has a master's degree in ..." is probably too much for a script, but depending on how much programming is feasible, a mention of "degree" could be turned into "<lastname> has ...", ";" could be turned into "and", and so on. The result would have to be reviewed and likely corrected. Adding [brackets] to linkable articles would be possible: I have seen a script which checks a local MediaWiki database for possible article links, but I did not bookmark it when I came across it some weeks ago.
Anyway, preparing the info as far as possible should already be helpful and might reduce the work needed to make a nicely worded biography. All the rest should be quite doable without problems; it just needs some programming.
Okay, that's excellent if most of this can be done. Don't worry about the biographical information, it would be a complicated job to turn it all into proper sentences that made sense using the script. What I had in mind was just for that bit of text to be dumped straight in from the EP page and then I can sort it out manually. Would that be possible? The constituency thing I can see will be a problem as that information is not on the individual pages. However, not all countries use it anyway, and I can add it in myself from the lists we have on Wikipedia. To make it easier for me to add it would it be possible to use the formula I suggested (the top one) and just leave the "region" part to be filled in manually which I can do easily. This would mean producing slightly different wording for the countries which use that system (which are Belgium, France, Ireland, Italy, Poland, UK). Germany I can't quite work out, because some of the entries in List of Members of the European Parliament 2004-2009 for German MEPs have regions and some don't. I think it would be best to use the second wording for Germany, as the regions can always be added in later. If any of these things are complicated it's fine to leave them out, like the links to the pdf's declaring their interests are not that important, but I just thought if it was possible for the script to extract the link targets from the page without too much trouble that would be useful. I think we have got it mostly worked out then, if I have explained myself well enough. It would be nice to see an example produced by the script before you run it for every single person. When you run it how does it come out? Would it be possible to make a text file with a couple of line breaks between each entry? This will come to about 1MB in all I think. Then you could paste it in a draft space somewhere, broken into several pieces if necessary. 
Once the script has produced its output most of the work is done, it will just be a case of copy and paste and some minor changes here and there. How does all this sound? — Trilobite (Talk) 06:08, 10 Apr 2005 (UTC)
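The choice between the two wordings could be sketched roughly as follows. This is only an illustration in Python, not the script actually used: the function name `intro_sentence` and its parameters are hypothetical, and the set of countries is simply the list given above. DATE, YEAR, PLACE and REGION are left as literal placeholders, with REGION meant to be filled in by hand.

```python
# Countries that, per the discussion above, elect MEPs by region.
REGIONAL_COUNTRIES = {"Belgium", "France", "Ireland", "Italy", "Poland", "UK"}


def intro_sentence(name, country, nationality, party, europarty, pronoun="He"):
    """Return the opening wikitext for one MEP (hypothetical helper)."""
    lead = (
        f"'''{name}''' (born [[DATE]] [[YEAR]] in [[PLACE]]) is a "
        f"[[{country}|{nationality}]] politician and "
        f"[[Member of the European Parliament]] for the "
    )
    if country in REGIONAL_COUNTRIES:
        # First wording: REGION stays a placeholder for manual editing.
        return (
            f"{lead}REGION of {country}. {pronoun} is a member of the "
            f"[[{party}]], part of the [[{europarty}]]."
        )
    # Second wording: single national constituency (also used for Germany).
    return f"{lead}{party}, part of the {europarty}."
```

Germany is deliberately not in the set, so German MEPs get the second wording and a region can be added later.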
We can do it that way. I've just had one idea which somehow didn't occur to me before, concerning the link to the country of the MEP. I think it's actually more interesting to link to the national politics page of the country than just to the country. Even if it's true that European politics is not like national politics, in many cases the MEP had a former life in national politics, and to some extent knowledge of how politics works in the member state should also help to understand the overall situation better. So, e.g. for Austrian MEPs (I'm originally from Austria), I could see a link like "Austrian politician" being used.
"Austrian politician" also points me to the story of national politics and not so much to EU politics, but on the other hand it depends on the MEP.
There are also some MEPs who cannot be put into the categories of national politicians because they aren't really national politicians. E.g. Daniel Cohn-Bendit has not really been a national politician (at least not so much in Germany; in the last EP elections he ran as the lead candidate of the German Greens, while in the elections before he ran in France). He is best described as a Green politician, which is the current political category of his page. Reinhold Messner is not even categorized as a politician; it is merely said that he held political office as an MEP in the last term. Also, Karin Resetarits does not seem to be viewed as a politician in Austria. The list on which she was elected was created by Hans-Peter Martin, who left the SPÖ in the last EP term and did not join any other party; he founded his own party last year for the election. I think the only uniformly true description would be "MEP from [COUNTRY]", but indeed many MEPs were political figures in their member state before they became MEPs, so I think we can leave e.g. "French politician" as the default and people can always change it to fit individual MEPs where necessary.
That went far away from the topic and I do not have any strong feeling about it; I just wanted to mention that this is not always the best description.
About the constituencies: thanks for pointing me to the page with the constituency info. I think I can make use of the information on this page and maybe integrate it. Indeed many countries did not have regional constituencies in the last elections; e.g. in Germany the parties could choose whether they wanted regional constituencies or not. Only the CDU and CSU (both PPE; federally they are united, but Bavaria is the area of the CSU) chose to use them, so that's why many German MEPs have one and many do not.
It would be interesting to know where the info for this page was gathered (maybe from the elections2004 web site), but since it is written in MediaWiki's table format in the page source, I should be able to parse this information and add it to the generated wikisource of the MEP.
I'll try to get all the useful information I have into the wiki source. I don't know if the link to the declaration of financial interests is important enough to have it directly in the external links section, since it's linked from the EP biography page anyway, but I should be able to generate the link if you would like to have it.
Where the EP biography page contains one or more links to web sites of the MEP, I'll of course generate links to them.
I don't yet know a safe way of determining the gender of an MEP, except for those MEPs where a salutation like "Sra.", "Herr" or so is mentioned on the EP biography page. So for now I can only use "He" or "She" where I have such a salutation; for the others, I'll use the last name instead. If you know more ways to find out the gender, I'm happy to integrate them.
I'll upload the output of the script as individual web pages, where each MEP is an individual file in a directory for his country, possibly even in subdirectories for his party and group. I can then concatenate this in any way you like and make some bigger web pages, e.g. with all MEPs of a country, sorted by name, party or group. I can make it so that there are both web pages you can cut and paste the source from and text files, and you can use whichever is more convenient.--NoSoftwarePatents 12:17, 10 Apr 2005 (UTC)
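The salutation fallback described above might look something like this in Python. It is a sketch under assumptions: the salutation sets contain only the forms mentioned on this page ("Sra.", "Herr", "M.", "Mme."), the function names are made up for illustration, and taking the last word as the last name will be wrong for multi-word surnames.

```python
# Assumed salutation sets; only forms mentioned in this discussion are listed.
FEMALE_SALUTATIONS = {"Sra.", "Mme."}
MALE_SALUTATIONS = {"Herr", "M."}


def split_salutation(raw_name):
    """Return (salutation or None, name with the salutation removed)."""
    cleaned = raw_name.strip()
    first, _, rest = cleaned.partition(" ")
    if rest and first in FEMALE_SALUTATIONS | MALE_SALUTATIONS:
        return first, rest
    return None, cleaned


def subject_for(raw_name):
    """Return "He"/"She" when a salutation is known, else the last name."""
    salutation, name = split_salutation(raw_name)
    if salutation in FEMALE_SALUTATIONS:
        return "She"
    if salutation in MALE_SALUTATIONS:
        return "He"
    # Fallback: naive last-name guess, to be reviewed by hand.
    return name.split()[-1]
```

Stripping the salutation in the same step also keeps it out of the bold name at the start of the article.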
I think then we are not far off being able to generate the pages. Don't worry too much about the details - I can change from he to she just by looking at their EP page. Also the constituency thing can easily be added in by hand so there is no need to spend too much time programming it to be taken from the big list, particularly if there is confusion about the German MEPs (though I think I understand the situation there now). Personally I think I would find it easier to work from a text file where they are all listed one after the other in plain wikitext (in any order, I don't have a preference), but do whatever is most straightforward. I think that is all for now - I am looking forward to seeing the results! — Trilobite (Talk) 12:55, 10 Apr 2005 (UTC)
OK, I have now generated a first usable source and I've tuned it to produce exactly what you see on Catherine Trautmann now (no manual editing was needed to get the whole page). Yes, so far it is tuned specifically for her, but it has e.g. a generic [University of xyz] linking, so some of the links will work for many other MEPs too.
If you would like to have some other articles linked by default too, just tell me. In this state, the wikisource for the other French Socialist MEPs should be of similar quality. I'll upload a text file or web page with all French Socialists soon. Look at the generated article (you can update it, e.g. if you like, you can turn the item list into a worded biography, I won't touch it anymore) and tell me if you want something changed in the script for the next MEPs. --NoSoftwarePatents 17:16, 10 Apr 2005 (UTC)
This is great, you have got it. You don't need to worry about links in the career section as I will be refactoring these into paragraphs anyway. It looks like it's all ready now. — Trilobite (Talk) 18:57, 10 Apr 2005 (UTC)
Thanks, I've uploaded the current wiki texts of the French MEPs here: http://raidbox.ffii.org/~bkaindl/MEPs/French/ It's not completely finished: I should make sure that all the EP group links are generated correctly (for now only PES and EPP-ED should work), and more could be done regarding the linking to the national party articles where needed. If you have some mappings for me that I should integrate (like the one I've done with Socialist Party (France) for the first MEP), I can add them. The same goes for constituency links. I think that's a good start, I'll go for dinner now.
Italy, Spain, Germany now added below: http://raidbox.ffii.org/~bkaindl/MEPs/
Fixed a bug and uploaded some more to the same URL
I've uploaded wikisources for MEPs from all countries now, these are 526 MEPs. The MEPs with accent marks in their names are missing at the moment, but it will be the next goal to add them as well.
Thanks for all your work. I am working on putting them up now. — Trilobite (Talk) 04:10, 12 Apr 2005 (UTC)

I've just noticed a bug in yesterday's version: the information on the parliamentary group was missing and the national party information was duplicated in its place. I've fixed this now and added links to the PDFs with the declaration of financial interests. For 51 MEPs, no declaration is online, so I have created a template for the text which should be shown instead. It is used like this:

It also adds the category tag which you can currently see at the bottom of this article. I generate the link for this template for the MEPs where there is no declaration available. I've also removed the "M.", "Mme." and so on from the name of the MEP at the beginning of the article.

Further Updates

I've done another upload and created another template for creating a link to the declaration of financial interests:

It's not necessary to use it, but it allows for more flexibility regarding these links. If we ever need to (the URL might change some day in the future), we could change the URL to the PDF files in one central place and have all URLs where it's used updated. It also allows for a compact way to create the language link: the "fr" in the source is transcluded by the template(s) behind it to "in [[French language|French]]; ", and the PDF notice is also defined once for all as part of the template.

If a template like this shows a red "Template:ISO 639-1/it" string, this is an indication that the template for providing the link "[[Italian language|Italian]]" has to be created:

You only have to click on the red link, click edit there and create the template only using this string:

[[Italian language|Italian]]

After saving it, this becomes part of the above link. (You only have to do a reload of the page to test the result.)
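For creating the missing language templates in bulk, the titles and bodies could be prepared with a small Python sketch like the one below. The mapping of codes to language names is an assumed, incomplete sample, and the helper names are hypothetical; only the title pattern "Template:ISO 639-1/it", the body "[[Italian language|Italian]]" and the expanded note "in [[French language|French]]; " come from the description above.

```python
# Assumed sample of ISO 639-1 codes and English language names.
LANGUAGES = {"fr": "French", "it": "Italian", "de": "German"}


def template_title(code):
    """Page title of the per-language template, e.g. for "it"."""
    return f"Template:ISO 639-1/{code}"


def template_body(code):
    """Wikitext to save as the content of the per-language template."""
    name = LANGUAGES[code]
    return f"[[{name} language|{name}]]"


def language_note(code):
    """Text the declaration link is described as expanding to."""
    return f"in {template_body(code)}; "


if __name__ == "__main__":
    for code in sorted(LANGUAGES):
        print(template_title(code), "->", template_body(code))
```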

Tests/already created templates:

The list of ISO 639-1 codes to use is in the article ISO-639, in the column with the two-letter (Alpha-2) codes for ISO-639-1.

The list of templates for this is linked from Template:ISO 639-1. It also demonstrates what the templates expand to. It's a good place for creating all templates for the 20 EU languages, and these templates can be used everywhere in Wikipedia where you want to refer to a language:

Looking at this, I think a template called by {{Language/fr}} could be nicer for common use. We could possibly rename the internals of the declarations template this way; for a start I have tried to use the most specific template name possible, and "ISO 639-1" is really very specific, so I can't think of any conflicting use of the same template names for a different purpose.

I've done a new upload in which the time information of the biography is, in most cases, moved to the start of the line. I have not caught all cases, but most; I'll work on the rest now.

It also tries to separate the biography (where Education, Career and Decorations were all under the section Career) into three sections: Education, Career and Decorations. Of course the respective section titles only appear when I have found an indication that information for such a section is coming; e.g. not every MEP has decorations.
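The kind of splitting described here could be sketched in Python as below. This is not the script's real logic, which isn't shown on this page: the keyword lists and the function names `classify` and `split_biography` are assumptions made purely for illustration, and a keyword match such as "university" will misfile some career entries, so the output would still need review.

```python
# Hypothetical keyword hints; the real script's indicators are not known.
KEYWORDS = {
    "Education": ("degree", "university", "studied"),
    "Decorations": ("order of", "medal", "knight"),
}
SECTION_ORDER = ("Education", "Career", "Decorations")


def classify(line):
    """Guess the section for one biography line; default is Career."""
    lowered = line.lower()
    for section, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return section
    return "Career"


def split_biography(lines):
    """Group lines into sections, emitting only non-empty sections."""
    buckets = {section: [] for section in SECTION_ORDER}
    for line in lines:
        if line.strip():
            buckets[classify(line)].append(line.strip())
    parts = []
    for section in SECTION_ORDER:
        if buckets[section]:
            parts.append(f"=={section}==")
            parts.extend(f"* {item}" for item in buckets[section])
    return "\n".join(parts)
```

A section heading is only written when at least one line was assigned to it, so an MEP without decorations gets no empty Decorations section.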

Fixed some bugs with the new feature now, but it seems you were not hit by them, or you corrected them yourself like you corrected the other bugs which I didn't see (missing chairmanships of committees). I also fixed the Chair/Vice-chair issue where the script produced wrong output and the committee was missing completely. Have a look at the output which it produces now. I tried to match your edits, although it's not 100% done yet. --NoSoftwarePatents 07:00, 14 Apr 2005 (UTC)
The server where I put the files is offline now, so I have uploaded the files here: http://ffii.org/~bkaindl/MEPs/ and updated the MEP stub template to point to the new URL.