Wikipedia talk:Category intersection/Rick's mockup/African-American actors/Archive 1
This is an archive of past discussions on Wikipedia:Category intersection. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Comments
Rick, I don't see how the Broader/finer feature will work:
- The broader categories are not primary categories so they would not have any articles that would likely lead to anything being in the intersection set. A few of the finer categories are also not primary categories.
- Christian actors, Actor-singers, Actor politicians and Actor sportspeople would not exist in your system (if I'm understanding your most recent comments correctly) because these would be replaced by intersections. So how would they be linked to this intersection set?
- For the "broader" list I simply listed the current parent categories of the categories that are intersected here while the "finer" list is (some) of the current subcategories that aren't in the intersection set. Is this useful? In some cases I think it might be. For example, if instead of "Actors" this were "Film actors" or "Academy award winning actors", this "parent" list would include a larger superset category. Even as it is, it might be useful for traversal purposes. To get from here (African-American actors) to African-American directors, you might click People by occupation (which would unclick Actors because Actors is directly in People by occupation), recompute which would probably lead to a null intersection but from this intersection the "finer" list would include all subcats of People by occupation, and from there click Directors (which should probably automatically unclick People by occupation) and finally "recompute". This is 4 physical clicks, two "browser local" clicks to select the categories being intersected and two web page fetches. Currently, from Category:African-American actors you can get to Category:African-American/Black film directors with two clicks (both web page fetches) through Category:African Americans. However, you can't currently get to Category:African-American dancers at all (because it has not been populated or created, even though Alvin Ailey is categorized in both category:African Americans and category:American dancers). -- Rick Block (talk) 14:00, 14 August 2006 (UTC)
New broader/finer interface
I've redone the broader/finer interface. There's one column for each category in the intersection. The categories shown above the "intersection" category are the ones that category is in. The ones shown below are categories in that category (I've restricted both to categories that I think would not be intersections). The columns might be of different sizes, but all categories in the current intersection would be shown in the same row. All categories above this row are "higher level", all categories below this row are "lower level" categories. When you pick a different member in any row, the currently selected category is NOT unselected automatically.
From this display, you can immediately get to a broad range of other intersections, and easily get to lots of others. For example, to get to any other African-American <occupation> intersection you click "People by occupation", unclick "Actors", and hit "Recompute intersection". The intersection that's displayed now has a row with People by occupation and American people and People of African descent. It likely doesn't have any articles or "subcategories", but the leftmost column now shows all occupations (subcats of People by occupation). So you click a different entry in this column (and recompute) and there you are. If you want combinations of occupations (African-American actor politicians), from this display you click both Actors and Politicians. -- Rick Block (talk) 02:51, 20 August 2006 (UTC)
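As a rough illustration of what "Recompute intersection" amounts to: an article belongs to an intersection when it is tagged with every selected primary category. The Python sketch below assumes a simple in-memory mapping from article titles to category sets; the mapping, the function name and the two placeholder articles are hypothetical, and only the Alvin Ailey example comes from the discussion above.

```python
# Hypothetical toy data: article title -> set of primary categories.
# Only the Alvin Ailey entry is taken from the discussion; the rest are placeholders.
ARTICLE_CATEGORIES = {
    "Alvin Ailey": {"African Americans", "American dancers"},
    "Placeholder article X": {"African Americans", "Actors"},
    "Placeholder article Y": {"American dancers"},
}


def intersect(selected, article_categories):
    """Return the sorted titles of articles tagged with every selected category."""
    wanted = set(selected)
    return sorted(
        title
        for title, categories in article_categories.items()
        if wanted <= categories  # subset test: all selected categories present
    )


if __name__ == "__main__":
    print(intersect(["African Americans", "American dancers"], ARTICLE_CATEGORIES))
```

Under this model an intersection such as African-American dancers never has to be created or populated by hand; it falls out of the existing primary categorization.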
- I'm going to attempt to push you on these ideas by asking some questions because I think this has possibilities, but I don't yet see it clearly.
- What happens when an intersection uses 4 primary categories? What happens when it uses 19 (Imagine that someone clicks every primary category listed under George W. Bush except one). I don't think we want to limit the number of categories that someone can use for an intersection just so there is room to format it nicely.
- It's a table, so it expands and contracts as needed. 19 might involve some horizontal scrolling.
- How does someone understand that to get to other African-American occupation categories you have to first Navigate to the intersection with Category:People by occupation and then do a second selection? This seems extremely non-obvious. I didn't understand it at first and I used to be a software designer, programmer and database consultant.
- There would presumably be a help page of some sort about this, that would walk through some examples. Now that you see it, does it seem usable?
- There are 188 subcategories of people by occupation. What will it look like when you get to the interim intersection page? Will there be 185 primary categories listed in a very long column? What does it look like when people pick several categories populated with scores of primary categories, like intersecting professions, nationalities, ethnicities, year of birth, etc...
- I was thinking about this a bit this morning. I think there are two (complementary) answers. The first is that the column length has to be limited, like the number of articles or categories in an existing category is limited. For display purposes, it's probably best if the entire "intersection selection table" fits on one screen. I can't think of any categories that are in hundreds of categories, so I don't think the size above the "current intersection row" is likely to be unmanageable. For categories that include hundreds of categories, the size below the category intersection row is an issue. The software needs to limit the list, to something like 20 or 30 subcats, and include previous/next links (per column) to "scroll" up and down within the set of subcats. This means there has to be a URL format to specify the starting location of each subcat list (similar to the &from= parameter for existing category listings). This should be a & parameter, perhaps &fromsubcat=<intersection cat>::<subcat>. The second complementary solution might be to restructure some of these basic "index" categories with subcat groupings. For example, Category:People by occupation might be subcatted by class of occupation, with the actual occupations listed only under the occupation classes. Perhaps Category:Scientists, Category:Entertainers, Category:Politicians, Category:Businesspeople and a few more would be sufficient. This could not be done lightly, since it creates intermediate "primary" categories that should be fully populated (every microbiologist should be in category:microbiologists, category:biologists, and category:scientists), and it creates a larger distance between two arbitrary occupations. It seems like a 3-deep structure should be sufficient for nearly all occupations.
- How is this easier or preferable to my method where someone would simply click on the parent category Category:African-Americans by profession and then select one of the 188 categories that are listed there as subcategories? What is improved by making it work totally differently from what people are already used to doing? How is the abstract notion of an intersection set better than a named category in plain English?
- I think this gets back to whether most intersections people are interested in will "preexist" or not. With your approach, if a user gets to an intersection that does not preexist, I think they'll be pretty much stuck. These intersections will not be connected to any rich index categorization scheme – no parent categories except the "primaries" that are being intersected, no way (except by typing in a URL) to get anywhere else in the categorization structure except these primaries. The two different kinds of intersections (preexist or not) have a very different "connectedness" to the rest of the category structure. The more likely a user is to get to a preexisting intersection, the more problematic I think this becomes in the cases where the intersection doesn't already exist (i.e. if you rarely run into this, you won't know how to deal with it). Making all intersections the same forces people to deal with the interface, and learn how to use it.
- Following through your own example, if I'm at Category:African-American actors (which is behind the scenes an intersection), and it is marked as a subcat of Category:African-Americans by profession (which I notice does not exist), and I click on this parent category, what do I see? Only the preexisting intersections that are listed as subcats? How do I get to Category:African-American architects (which doesn't exist today)? It seems to me Category:African-Americans by profession should be generated (much in the same way as an intersection category), with "subcategories" consisting of all possible intersections of Category:African-American people and subcats of Category:People by occupation (whether these are preexisting intersections or not). With your proposal, I don't see how these "cross product" index categories can work – it seems like their subcats have to be limited to preexisting intersections. With my scheme, Category:African-Americans by profession would be Intersection:American people::People of African descent::People by occupation, not a category, it might preexist or not (the only difference being whether there's a description for it), and a new intersection of Category:American people, Category:People of African descent and any subcat of Category:People by occupation could be accessed, whether this new intersection preexists or not.
- I think the real problem is how do you take an intersection and add yet another unrelated primary category into the intersection mix. In my method, an editor could add a subcategory Category:African-American actors by state, Category:African-American actors by series, etc... Since this is controlled by user edits, only subcategories that make sense and are useful would be listed. How do you add another primary category in your system?
- Let's think about how this would work in your system first. You add Category:African-American actors by state as a subcat of Category:African-American actors. You click the new cat, Category:African-American actors by state and add a description. It has no members, and its members can't be automatically generated, so I guess you have to create 50 new intersection categories (one for each state) and add them as subcats to your new index cat (is this right?). And Category:African-American actors by series is worse, since the list of television series is nowhere near as constrained as the list of states.
- OK, now my turn. You add a "see also" link from Category:African-American actors to Intersection:People of African descent::Category:Actors::Category:American people by state (maybe you pipe this link so it shows up as "See also: African-American actors by state"). You're done. Maybe you go to the new intersection and add a description or maybe you don't. What you see when you get there is an intersection selection table that lets you intersect People of African descent, Category:Actors, and any subcat of Category:American people by state, whether these intersections preexist or not. You don't need to create them. You don't need to add them as subcats of the "index" cat you just created. And, you can select to go to African-American stage actors from California, or any other mix of any of the subcats of the intersected cats. In the by-series case, you add a link to Intersection:American people::People of African descent::Actors::Television series and you don't need to add a new intersection subcat every time someone adds a new show to Category:Television series.
- --Samuel Wantman 07:07, 20 August 2006 (UTC)
- I think the bottom line is that keeping these concepts orthogonal is extremely powerful. I believe I've mentioned this before, but keeping the namespaces separate completely resolves the "what happens if someone adds [[Intersection:x::y::z]] (or [[:Intersection:x::y::z]]) to an article or category" issue. It's simply a link to the intersection. I suppose adding [[Category:x]] to an intersection should add the intersection to the category (where you can specify a more readable name with a pipe). I don't think this would be very commonly used, although it's effectively an automatic "see also" link. To get an intersection to show up this way in another intersection, you'd have to add it to all the primary categories comprising the desired intersection. -- Rick Block (talk) 16:39, 20 August 2006 (UTC)
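The per-column paging idea raised earlier in this thread (a limited list of 20 or 30 subcats with previous/next links, driven by a parameter along the lines of &fromsubcat=<intersection cat>::<subcat>) could be sketched roughly as follows. This is a Python illustration only: the parameter name comes from the discussion, but the function names, the decision to allow one fromsubcat value per column, and the alphabetical ordering are assumptions.

```python
from urllib.parse import parse_qs


def parse_fromsubcat(query):
    """Map each intersected category (column) to the subcat its list should start at.

    Assumes one fromsubcat=<intersection cat>::<subcat> value per column.
    """
    starts = {}
    for value in parse_qs(query).get("fromsubcat", []):
        column, _, start = value.partition("::")
        starts[column] = start
    return starts


def page_of_subcats(subcats, start="", limit=20):
    """Return up to `limit` subcats, alphabetically, beginning at `start`."""
    ordered = sorted(subcats)
    return [name for name in ordered if name >= start][:limit]


if __name__ == "__main__":
    starts = parse_fromsubcat("fromsubcat=People+by+occupation::Directors")
    occupations = ["Actors", "Architects", "Dancers", "Directors"]
    print(page_of_subcats(occupations, starts["People by occupation"], limit=2))
```

A "next" link for a column would then simply carry the name of the first subcat that did not fit on the current page.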
I'm going to break out of the list because the :s and #s are getting me dizzy.
I'm not getting your bottom line. My scheme provides for the "what happens if someone adds [[Intersection:x::y::z]] (or [[:Intersection:x::y::z]]) to an article or category". As you say, it's simply a link to the intersection. I don't know what issue you are talking about. In my scheme you may have the option of naming the intersection as a category, or seeing the category page if it has already been named, but the intersection page would remain a separate name-space. If it is connected to a category it is transcluded into the category when you are looking at the category or pick categories from an article. Otherwise, you look at the intersection page (and only admins get to edit it).
My general response to you is that we are discussing two different things and imagining two very different classes of users. I am imagining a mature structure where categories have been created and I'm primarily thinking about making this easy for the casual users of Wikipedia, who I think are the large majority. You are imagining the system at day one, and seeing the categories empty and uncreated and are imagining the technologically savvy user. I'll give you an example of what I mean... In #5 the question I posed was how does the casual user start from an intersection that has 3 categories and NAVIGATE to an intersection that includes a fourth category. I'm assuming that many useful intersections have been named and linked already. (BTW, I don't like the term pre-existing. All the intersections exist from day one, they just don't have names and places in the categorization structure. Can we use the terms "named" and "un-named"?) Since the naming and linking have been done by the experienced category editors a wonderful structure has already evolved and the casual user is presented with a list of very useful subcategories. I also think the casual user will help create this structure because they will create many useful intersections by checking off the categories in the articles that they browse to. I suspect that oodles of intersections will be named very quickly. Perhaps the first category page created automatically gets categorized as Category:Orphan intersection categories and there will be editors that will patrol the orphan category and figure out where they should be linked. Since the busywork of populating subcategories and depopulating primary categories will go away there will be a good number of editors looking for something to do to boost their edit counts. I do think that most of the useful intersections will get named and categorized very quickly. I'm recalling when categories first started. 
I thought, "this will never work, it will take much too much effort to convert all those lists into useful categories." I was amazed how quickly it happened. This process seems much easier than that one. In addition, I think most people will be very happy with finding intersections from articles. For the most part, people will be looking for things close to where they are.
I do think there is something quite nice about what you are trying to design. I would like to add something like it to the proposal. I don't think what I have described is contrary to what you are talking about. I see it as another feature, a way to navigate through intersection space. I would like people to have an easy way to navigate through intersection space. If you read what I wrote about in the proposal, the category/intersection matchup only happens when you look at a category or select categories from an article. You should still be able to search for intersection pages, type the url, or add links directly to intersections. In these cases the category would NOT be displayed automatically (though there might be an automatically generated link). (I'm sorry, it's getting late, I'm getting sleepy, and I'm repeating myself.)
I am not sure what the best mix will be. It certainly would be nice to be able to navigate quickly and easily to any intersection. If the design for that interface for navigating around intersection space were easy enough to use and they could be renamed, then there may be no need to call them categories. On the other hand, we can call them intersection categories and it would look and feel very much like what I've designed. So as I see it, the question is not your system versus mine, but can your broader/finer idea evolve into something that is simple enough to use.
I don't think it is there yet. The horizontal scrolling might be unwieldy; the breaking of large categories (nationalities, professions) into smaller subgroups seems forced and an arbitrary constraint; and displaying only part of the long subcategory lists would make it impossible (or at least very difficult) to combine "Albanians" with "Veterinarians" if they are on opposite ends of long categories.
It would be possible to do the broader/finer thing by limiting the degrees of freedom so that you can only vary one intersection at a time. If this were so, it could be displayed similarly to how subcategories are displayed. This is the implied hierarchy of the intersection space, and you would move through it one change at a time. For example, 'Child actors' is a subdirectory of 'Actors'. 'African-American child actors' is a child of all three primary categories. It would be possible to add these to both your system and mine. We could have a section of sub-intersections for each parent category displayed. It would be possible to get to all the same places, but it would take a few more interim page-loads. For example:
- Intersections using subcategories of Actors
- Intersection:Character actors::American people::People of African descent
- Intersection:Child actors::American people::People of African descent
- Intersection:Fictional actors::American people::People of African descent
- Intersection:Actors by medium::American people::People of African descent
- Intersection:Porn stars::American people::People of African descent
- Intersection:Actresses who have appeared veiled::American people::People of African descent
- etc...
- Intersections using subcategories of American people
.
.
.
- Intersections using subcategories of People of African descent
.
.
.
- Parent Intersections (using parents of Actors)
- Intersection:Acting::American people::People of African descent
- Intersection:Entertainers::American people::People of African descent
- Intersection:People by occupation::American people::People of African descent
- Parent Intersections (using parents of American people)
.
.
.
- Parent Intersections (using parents of People of African descent)
.
.
.
These could all be displayed in the same format that subcategories are now displayed. Certainly this could result in some very long pages, but at least it is just long in one dimension.
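The "one change at a time" listing above can be generated mechanically: for each member of the current intersection, swap in each of its subcategories (or parents) while holding the other members fixed. The Python sketch below is only an illustration of that idea; the dictionaries and function names are hypothetical, and the category data is a small subset of the example lists above.

```python
# Hypothetical slice of the category graph, taken from the example lists above.
SUBCATS = {"Actors": ["Character actors", "Child actors"]}
PARENTS = {"Actors": ["Acting", "Entertainers", "People by occupation"]}


def neighbours(intersection, graph):
    """For each member, list the intersections reached by replacing only that member.

    `graph` maps a category to its subcategories (or to its parents).
    """
    result = {}
    for i, member in enumerate(intersection):
        result[member] = [
            intersection[:i] + (replacement,) + intersection[i + 1:]
            for replacement in graph.get(member, [])
        ]
    return result


def title(intersection):
    """Render an intersection in the long Intersection:a::b::c form."""
    return "Intersection:" + "::".join(intersection)


if __name__ == "__main__":
    current = ("Actors", "American people", "People of African descent")
    for member, options in neighbours(current, SUBCATS).items():
        print("Intersections using subcategories of", member)
        for option in options:
            print("  " + title(option))
```

Passing PARENTS instead of SUBCATS yields the "Parent Intersections" sections in the same way.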
I'd be happier if these had names that made sense in English, and you could pipe them as you say. If they had names, it would look like this:
- Intersections using subcategories of Actors
- African-American character actors
- African-American child actors
- African-American fictional actors
- African-American actors by medium
- African-American porn stars
- African-American actresses who have appeared veiled
- etc...
- Intersections using subcategories of American people
.
.
.
- Intersections using subcategories of People of African descent
.
.
.
- Parent Intersections (using parents of Actors)
- Parent Intersections (using parents of American people)
.
.
.
- Parent Intersections (using parents of People of African descent)
.
.
.
I think this is much easier to comprehend, and takes up much less space. All of this can be generated automatically as long as the intersections are named, either your way or mine. It would also be possible to combine all the sub-intersections and parents and display them the same way as current categories:
- Subcategory intersections
- African-American character actors
- African-American child actors
- African-American fictional actors
- African-American actors by medium
- African-American porn stars
- African-American actresses who have appeared veiled
- etc...
- Intersection categories
I can see setting it up like this. All the listings could be generated automatically using the categorization structure. If the intersections are named, it could use the name, if they are not named they would display with the long intersection mark-up. See also links could be added to get to intersection pages that use additional traits.
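The display rule described here (use the English name if the intersection has been named, otherwise fall back to the long intersection mark-up) might look something like the Python sketch below. The lookup table, its single entry and the function name are assumptions for illustration; the key is an unordered set so that the order in which categories were selected does not matter.

```python
# Hypothetical table of named intersections, keyed by the unordered set of
# intersected categories.
NAMES = {
    frozenset({"Child actors", "American people", "People of African descent"}):
        "African-American child actors",
}


def display_label(categories, names):
    """Return the English name if one exists, else the long intersection mark-up."""
    fallback = "Intersection:" + "::".join(categories)
    return names.get(frozenset(categories), fallback)


if __name__ == "__main__":
    print(display_label(("Child actors", "American people", "People of African descent"), NAMES))
    print(display_label(("Porn stars", "American people", "People of African descent"), NAMES))
```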
This is essentially your scheme, formatted like my scheme. We can discuss the formatting later along with whether there are checkboxes or listings. First I want to figure out all the functionality. I still have some unanswered questions with your scheme. How do you navigate from a primary category to an intersection? Are intersections put into categories? Are there only "see also" links? Is the hierarchy: {Primary categories, 2 category intersections, 3 category intersections, etc...} The thing I'm missing is the ability to make editorial decisions and say, "these intersections are useful so we put them into the categorization structure" and "These can be ignored because they are meaningless, or empty or just have one article in them."
So for your scheme:
- The intersections are separate from the categories,
- There may be a way to give them English names.
- It can list the sub-intersections and intersection categories automatically (like I've done above or by using a checkbox interface).
- "See also" links to connect up additional useful categories,
- Perhaps intersections get put into regular categories, (but only if they have 2 categories intersecting?)
Here's the disadvantages I see:
- It looks like a category, but it isn't. People will try to put articles into categories that have the names of intersections.
- The hierarchies are not part of an edited structure so every intersection is just as important as any other even though one might be very significant and another might be meaningless. There will be a huge number of meaningless intersections that people will be able to navigate through easily. It seems like it will be a maze that might be difficult to get out of.
For my scheme:
- The intersections look and behave like categories,
- There is a way to give them English names.
- Intersections are added to the categorization scheme one at a time, but there might be a good way to find intersections with a checkbox interface or some other system.
- Same "See also" links.
- Articles miscategorized get fixed.
- Only meaningful intersections make it into the hierarchy.
So back to #5. You said:
- you would add a "see also" link from Category:African-American actors to Intersection:People of African descent::Category:Actors::Category:American people by state (maybe you pipe this link so it shows up as "See also: African-American actors by state"). You're done. Maybe you go to the new intersection and add a description or maybe you don't. What you see when you get there is an intersection selection table that lets you intersect People of African descent, Category:Actors, and any subcat of Category:American people by state...
In my scheme, all I would have to do is make Category:African-American actors by state a subcategory of African-American actors, and the only thing that would appear in that category would be subcategories. It is possible that my scheme could have a system of finding children who haven't been named either with check-boxes or the way I illustrated above. This feature can be added to what I'm proposing as easily as it can be added to what you are proposing. (sorry for rambling on and on....) --Samuel Wantman 09:33, 21 August 2006 (UTC)
Rick's August 22 reply
- I think I confused you by starting the paragraph above with The bottom line is. I meant only that one sentence, i.e. that keeping these concepts orthogonal is very powerful. I added the other stuff in that paragraph (about the links) on a read-through of what I'd written (bad editing on my part).
- About the links (more bad editing), the issue is what happens if someone "adds" an article or a category to an intersection. With what you're proposing I think this will happen all the time (and I think you agree), and so for articles you've suggested the software automatically "do the right thing" (recategorize into the primary categories). On the other hand, adding categories to intersections is fine (in fact, necessary for what you're suggesting) so the software simply allows this. I see this as a significant issue, both in terms of the user mental model of what's happening as well as in the software. If a user adds [[category:African-American actors]] to an article, hits save, and this "category" does not show up in the category list, I think the user will think it's a bug. By mostly hiding intersections behind what look like categories I think you're ensuring not only that this will happen but that it will confuse users (because in your scheme intersections look so much like regular categories that most users won't know the difference – by doing this, you're preventing users from developing a mental model that matches what is really going on). Even if users do catch on that these intersection categories are different (and shouldn't be added to articles), they then have to realize that as far as the categorization structure is concerned these categories aren't different! Borrowing some of your words, I didn't understand this at first and have been a software designer, programmer, and system engineer (for Bell Labs and its offshoots) for 25 years.
- Contrast the above with what I'm suggesting. Categories are categories, completely unchanged from current operation. Intersections are new and different. No confusion. More complex mental model, but the inherent difference is completely visible which allows users to learn the difference.
- On the named/unnamed vs. preexist/not naming issue, I think this reflects your model of how this works. I believe you're thinking of intersections as intrinsically the same as categories, so "on the fly" intersections are categories that don't have a name yet. With what I'm proposing, intersections don't need a "category" name since they're fundamentally different from categories. The difference I mean to be highlighting is really whether there's an existing database record or not, mirroring the difference between a redlinked or "blue linked" article. I could try to use named/unnamed when talking about your scheme, but I'd rather use something different when talking about my scheme (since named/unnamed doesn't apply).
- You're thinking about casual users and a mature structure where categories (and I think you mean intersections) have been created (named?), and you meant the #5 question to be a navigation question. If we assume the structure is mature, I don't think there has to be a huge navigational difference. In your scheme, if a user is at Category:African-American actors to get to Category:African-American actors by state they click a "subcategory" link and from there, to get to a specific state, they click another "subcategory" link to get to, for example, Category:African-American actors from California. In my scheme, (assuming equal maturity in the structure) there's a "see also" link from Intersection:African-American actors to Intersection:African-American actors by state and from there they'd click a (previously manually created) link to the intersection for the state of interest, probably by state name. Assuming a fully populated category/intersection structure, I don't see this as much different.
- However, the scheme I'm suggesting works almost as well whether the category/intersection structure is fully populated or not, and I don't think yours does. In your scheme, navigation to an intersection (from the category structure) is possible only if the intersection has been both named and added as a member of one or more "parent" categories/intersections. In my scheme, the navigation can be "on the fly" by picking categories to intersect using the selection table displayed at the bottom of an intersection. From an article, we both start with picking a set of categories to intersect. In your scheme, there is no parallel operation available once the user gets to "category" space, and navigation within category space only functions between "categories" (which are either primary categories, or named intersections). I don't think we want, or need, to "name" most intersections. In this example, I wouldn't expect there to be named intersections for each state like Category:African-American actors from California, and if there aren't, then there's not much point in Category:African-American actors by state existing either. I think this means in your scheme these intersections would not be accessible from "above" in the category structure. A user might get to Category:African-American actors from Georgia from the Laurence Fishburne article, but it won't be named, won't be "linked in" to the category structure (except to the primary categories), won't have an obvious parent category that exists, and, I'd claim, we don't want it to. If we don't name these intersections we don't have to manage them (at all). We don't have to CFD the ones that don't make sense. We don't have to worry about their names. I'm not thinking about the system at day one, seeing the categories empty and uncreated – I'm thinking about a fully functional system that works without the overhead of manually creating 100s of thousands (if not millions) of named intersections.
- Jumping ahead a bit, your summary is the following (I'll switch to italics):
So for your scheme:
- The intersections are separate from the categories,
- There may be a way to give them English names.
- No. My proposal is intersections exist only in "intersection space". Their name consists of the names of the intersected categories separated with some punctuation characters. Links can have arbitrary (English) names through piping, but these are not intersection "names".
- It can list the sub-intersections and intersection categories automatically (like I've done above or by using a checkbox interface).
- I'm not so sure I buy anything other than a checkbox sort of interface. It seems like there are likely to be too many possibilities to enumerate them all, and I don't see a reasonable way for an automated system to limit the choices since we don't know what "dimension" the user is interested in.
- "See also" links to connect up additional useful categories,
- Perhaps intersections get put into regular categories, (but only if they have 2 categories intersecting?)
- I think if you add [[category:whatever]] to an intersection, the intersection is added to the "regular" category. The category tag can include a piped name (which for articles is used only for sort order, not appearance). I think it would be reasonable to use a piped intersection name in a category listing. Note that I think adding intersections to categories would be rare, since (in my proposal) all categories are primary categories.
Here's the disadvantages I see:
- It looks like a category, but it isn't. People will try to put articles into categories that have the names of intersections.
- Now wait, this is a disadvantage I see of your scheme. In my scheme, intersections DO NOT look like categories. The page title says Intersection:blah blah blah, not Category: blah blah blah. The page listing looks superficially like a category listing, but it's obviously, noticeably, different. I don't think people will try to put articles into "categories" that have the same name as intersections. They're two entirely different concepts. I think people will understand them as different concepts.
- The hierarchies are not part of an edited structure so every intersection is just as important as any other even though one might be very significant and another might be meaningless. There will be a huge number of meaningless intersections that people will be able to navigate through easily. It seems like it will be a maze that might be difficult to get out of.
- I'd rephrase this as the hierarchies don't have to be part of an edited structure. Any intersection will be easy to navigate to, whether some previous editor thought it was "useful" or not. This is the primary difference between Google and the earlier "hierarchically organized" web indices. Google makes no attempt to "preselect" what you might find interesting.
For my scheme:
- The intersections look and behave like categories,
- This only applies to named intersections, and only within "category space". I'd call this a disadvantage and word it as "A named intersection looks like a category, but it isn't. People will try to put articles into these categories and will be confused when they can't."
- There is a way to give them English names.
- More specifically, there is a way to give them a category name. Rather than repeat myself, let's just say I don't see this as a particular advantage.
- Intersections are added to the categorization scheme one at a time, but there might be a good way to find intersections with a checkbox interface or some other system.
- I see this as a huge disadvantage. Consider the recent split of Serbia-Montenegro into two countries. With my scheme, as soon as the countries exist in the appropriate parent categories, all intersections involving the new countries are immediately accessible through the "normal" intersection selection mechanisms. With your scheme, no intersection involving these countries appears in any "x by country" index category (and there are a LOT of these, even considering only the ones involving one other category) until the intersection is named and manually included in the appropriate parent category. Yes, a sufficiently knowledgeable user could get to these intersections but presenting intersections as categories and primarily using the category navigation mechanisms to get to them makes the maintenance of the category structure a never-ending, and not especially small, task. I personally dislike all the "x by y" index categories we have and would very much like to find a way to get rid of them.
- Same "See also" links.
- Articles miscategorized get fixed.
- I'm not sure what this refers to.
- Only meaningful intersections make it into the hierarchy.
- Which means the arguments at CFD about what is and is not "meaningful" continue ad nauseam. My scheme provides a mechanism to drastically simplify "the hierarchy" (reduces it to "primary" categories, with the rest created "on the fly" reflecting the particular user's interests).
- I get the sense that you are quite attached to your intersection=category idea. The more I think about this, the more I think we should keep categories and intersections separate. Would it be useful for me to put together a more complete user interface mockup? Perhaps multiple screens, how you'd navigate, etc.? Is there more talking we can do? Maybe split into two pages, one where you throw rocks at my scheme which I try to address and one where I throw rocks at your scheme which you try to address? Is it time to bring in more people? It doesn't seem like long essays are helping that much. -- Rick Block (talk) 02:52, 22 August 2006 (UTC)
Sam's August 22 response
Well these essays are helping me stay up too late. Yes it certainly would help very much if I saw your proposal worked out some more. Some of my resistance may be from not seeing the entire picture. I hope you take what I've written as encouragement to develop the ideas. When you went on your wikibreak and I started putting together the proposal it forced me to work out the details of how everything would work. It is now very hard for me to figure out if your version is acceptable because I'm not sure exactly what it is. So take the proposal, and copy it to a new page and edit away.
As we discuss, I've been modifying my mock-ups and they are moving in your direction. You've been addressing my concerns (like piping in English) and working out navigation. I think we will meet. I don't think we are really that different. The only real difference that I see is the transclusion of intersections. I'd be happy to part with that if everything else comes together. I realize, looking over the history of all this, that my very first message about this on your talk page sounded more like what you are now proposing than I am. I don't think we are at odds. I'm going to put some effort into your way of doing things...
I think a good deal of your resistance to my version (and vice versa) may be in the way things are presented. For example, you say "If a user adds [[category:African-American actors]] to an article, hits save, and this "category" does not show up in the category list, I think the user will think it's a bug." But in my imagining the same events the user adds [[category:African-American actors]] to an article, hits save, and Category:Actors, Category:American people and Category:People of African descent appears, the user will think "Wow, the software is smart!"
The presentation can integrate or divide the category and intersection namespaces as much or as little as needed to help people understand what is going on. I think the presentation of your system can probably be made clearer and I think the presentation of my system can probably be made clearer. The intersection used in a category could be linked where it is displayed or at the bottom of category pages just as templates that are used are now linked when editing. Some people will go to them, hopefully everyone who understands what an intersection is. I don't think I have introduced any new paradigms. There is already plenty of transclusion.
So let's get some commonality out of this. Things we both want:
- An easy to use interface for navigating down through category space, make a smooth transition to intersection space and then navigate through intersection space.
- A way to reference an intersection (this we agree on, it is just [[Intersection:xxx::yyy::zzz]] or something similar). The links could be piped (just like links to articles are) so that they appear as Zee-Wyian exes.
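As an illustration of the agreed reference syntax, here is a minimal Python sketch of how such titles might be parsed and built. It is not part of either proposal: the function names are hypothetical, the "::" separator is only one of the candidates under discussion, and sorting the category names so that one set of categories always maps to one title is an assumption added for the example.

```python
PREFIX = "Intersection:"   # namespace prefix as written in the link example above
SEPARATOR = "::"           # assumed delimiter; other markup is still being discussed


def parse_intersection(title):
    """Split 'Intersection:A::B::C' into the list of intersected category names."""
    if not title.startswith(PREFIX):
        raise ValueError(f"not an intersection title: {title!r}")
    names = [name.strip() for name in title[len(PREFIX):].split(SEPARATOR)]
    if any(not name for name in names):
        raise ValueError(f"empty category name in: {title!r}")
    return names


def canonical_title(categories):
    """Build a title; sorting (an assumption) makes the title order-independent."""
    return PREFIX + SEPARATOR.join(sorted(set(categories)))


def piped_link(categories, label):
    """Wikilink whose visible text is an arbitrary English label (piping)."""
    return f"[[{canonical_title(categories)}|{label}]]"


print(piped_link(["Poets", "English people"], "English poets"))
```

The piped label only changes how the link is displayed; the target is still the concatenated category names, so nothing has to be stored or managed for the English wording.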
So here are some things that would get me to like your system more:
- A system of having English headings for intersections. For example, at the top of the page it could say:
- Intersection::Actors::American people::People of African descent
-
- "African-American actors"
The top line would be the page name just as you have it in your mockup, but the second line could be a name that can be user-edited (I won't call it a category!) Here's an idea, what if the second name is added to the url when it is called by the piped link? That way, whatever the piping is, it will appear the same way when it is viewed as an intersection, and nothing else has to be done to make it appear. It might be named differently from different links "Poets from England" from one place, and "English poets" from another.
- A clearer way to get around intersection space. I wouldn't mind seeing something like this:
This reads like the lists of categories for actors and visually has higher and lower categories, emphasizes where we are, separates the groups, and can grow to whatever size is needed to display everything. I kinda like this, I'm going to put it in my mockup.
So now I am trying to understand how these two spaces will interact in your system. I can imagine that people will decide to put intersections into categories and if they did, I think they should end up in a section of their own like this:
- Intersections
There are 5 intersections shown below (more may be shown on subsequent pages).
etc...
I'd be happy if they used the page tag name instead of the URL name. It would take up much less space and be more readable than having:
- Actors::American people::Female people::People of African descent
- Actors::American people::Male people::People of African descent
So my question to you is, would you like to control the proliferation of categories for intersections? I gather you would. If so, how? I suspect that very quickly there will be lots of categories for intersection pages such as Category:African-American actors by U.S. state which would be a category filled with 50 intersections.
So I'm thinking this way of presenting your proposal would look pretty similar to what I was proposing and I could probably grow to like this way of doing it. It is possible to design this with the constraint that intersections cannot be categorized. All the fooian fooer categories would disappear and the only way to navigate down the category structure to Category:American writers would be to manually add a "see also" link to Intersection:People by nationality::Writers piped and tagged as "Writers by nationality" or Intersection:American people::People by occupation piped and tagged as "Americans by occupation" and all you would see is an empty intersection with the ability to pick new categories.
So this removes my problems with naming and navigating, but I still have an unanswered question, which perhaps is not answerable by the two of us without more input:
- Does it help people to make the line between intersection and category so clear? I think to programmer types and wikiholics the answer is probably "yes". But to the average Jane or Joe, it might be daunting and confusing. This is certainly a distinction that is never made in printed media. How many poets are going to think of English poetry as the intersection of English people and poets, and conclude that because English poetry is an intersection it is no longer a category? This is contrary to the spoken language understanding of what a category is.
- Will people find it easy and natural to find the English poets by checking off the box for English people and then clicking "recompute intersection"? It seems easier to just click on "English poets".
So, let's ask others. -- Samuel Wantman 11:03, 22 August 2006 (UTC)
Rick's August 23 reply
This is getting entirely too unwieldy. I've extracted the questions and some other comments from your latest (above) and addressed them here. Your words in italics.
It certainly would help very much if I saw your proposal worked out some more.
- I'm not sure when I might get to this, but I'll work on it at some point. I'm thinking about writing a simulator to show what my idea of an arbitrary intersection page might look like (based on actual data fetched from wikipedia). I've written several scripts that do various things – I think it would not be terribly different to write a simulator that would take an arbitrary intersection URI and produce a simulated page. The simulated pages are kind of a drag to make by hand, and if there are going to be more than one or two of them I think it might be easier just to write a simulator.
[I'd like] A system of having an English heading for intersections.
- If you're willing to live with a manual method, each intersection page has an "edit" link (similar to categories). You can add anything you want to the "body" of an intersection page, so just add the "English" name as the top line in the body. (I don't like inferring this name from a piped link name.)
A clearer way to get around intersection space.
- I'm not crazy about the horizontal presentation (vertical seems more "natural" to me). Either way, there will only be a limited number of subcats displayed. Maybe more examples might help clarify which way is better.
I can imagine that people will decide to put intersections into categories and if they did, I think they should end up in a section of their own like this:
- I agree people might, but I don't know how common this will be. The presentation you show seems reasonable. In my August 22 reply, I suggested the "piped" name be used for presentation in addition to sort order (even though this is NOT how piped category additions currently work for articles or categories).
So my question to you is, would you like to control the proliferation of categories for intersections? If so, how?
- Yes, I'd like to control the proliferation of "categories for intersections", and I'd like to replace essentially all of the "fooian fooers", and "fooians by foo" categories with intersections. If we can create a traversal mechanism in intersection space that's easy enough to use, then I don't think anyone will miss these categories (and I think using intersections instead of categories would eliminate a large maintenance burden). In the discussion above about Category:African-American actors by state, I propose this category not exist (as a category) but rather as Intersection:People of African descent::Actors::American people by state. This isn't an intersection intended to show articles, but an intersection intended to let the user traverse to a different intersection. If we can make this work (easily and intuitively), I think we can use the same technique for all the "fooians by foo" categories. What I'm envisioning is a mechanism allowing a user to get to any intersection, using other intersections as "indices".
All the fooian fooer categories would disappear and the only way to navigate down the category structure to Category:American writers would be to manually add a "see also" link to Intersection:People by nationality::Writers piped and tagged as "Writers by nationality" or Intersection:American people::People by occupation piped and tagged as "Americans by occupation" and all you would see is an empty intersection with the ability to pick new categories.
- I think there are a couple of ways this could work. The primary categories are category:American people and category:Writers (right?). If primary categories are the only ones that show up on articles, how does a user ever get to "Writers by nationality" (whether it's a category or an intersection)? Certainly not from an article. If we assume "Writers by nationality" is an intersection (see mockup), then let's say it's a precreated (we're talking about my scheme, right?) intersection, and this intersection is added to category:Writers and category:People by nationality perhaps with a piped name of "Writers by nationality". This would let a user traversing from either category:Writers or category:People by nationality get to this intersection. There almost certainly aren't any articles in this intersection, so what you see when you get there is a page that pretty much only has an "intersection selection table" (which may be vertically or horizontally organized). The "current" intersection is category:Writers and Category:People by nationality, so the immediately available "recompute" intersections would include any type of writer (subcats of category:Writers) and any nationality (subcats of Category:People by nationality). This intersection has been "precreated" (since it was in the appropriate parent categories), so it could have manually created links to various other intersections (perhaps in nice compact tables). Even if no one has provided these, the basic "intersection selection table" will allow a user to get to any "writers by <country>" intersection (as well as any "poets by <country>", or "screenwriters by <country>", or ... ). There are 248 subcats of Category:People by nationality and 27 subcats of Category:Writers (although many of these are currently "writers by X" index categories, which I think should become intersections), which leads to 6696 "fooian fooers" intersections (and 275 "x by y" index intersections).
I'm thinking all of these would be traversable from this intersection selection table. Creating an intuitive intersection mechanism that applies to categories in addition to articles makes all of these able to be automatically generated (requiring none of them to be "precreated"). Two links, one from category:Writers and one from Category:People by nationality are all that's needed to get any of these (nearly 7000) intersections.
- My guess is that if we make this traversal easy enough, no one will want to "create" these intersections. They'll simply be automatically available, to anyone who wants to use them.
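The arithmetic behind those figures can be checked with a short Python sketch. The subcategory counts (248 and 27) are the ones quoted above; the function names and the sample category names passed to the generator are hypothetical and only illustrate that every pairing can be produced on demand rather than precreated.

```python
from itertools import product


def count_intersections(n_left, n_right):
    """Return (pairwise intersections, single-sided 'x by y' index intersections)."""
    pairwise = n_left * n_right   # one subcat chosen from each column
    index = n_left + n_right      # one subcat chosen, the other column left whole
    return pairwise, index


def selectable_intersections(left_subcats, right_subcats):
    """Every title reachable from one selection table, generated on the fly."""
    return [f"Intersection:{a}::{b}" for a, b in product(left_subcats, right_subcats)]


pairwise, index = count_intersections(248, 27)
print(pairwise, index, pairwise + index)

# Hypothetical sample columns, just to show the shape of the generated titles.
for title in selectable_intersections(["Poets", "Screenwriters"],
                                       ["American people", "English people"]):
    print(title)
```

With the quoted counts this gives 6696 pairwise intersections plus 275 index intersections, 6971 in total, which is where "nearly 7000" comes from.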
Does it help people to make the line between intersection and category so clear? I think to programmer types and wikiholics the answer is probably "yes". But to the average Jane or Joe, it might be daunting and confusing. This is certainly a distinction that is never made in printed media. How many poets are going to think of English poetry as the intersection of English people and poets and because English poetry is an intersection it is no longer a category. This is contrary to the spoken language understanding of what a category is.
- Most definitely yes. Categories are primary characteristics. The intersection feature provides a way for anyone to search by any combination of these primary characteristics. I'd expect "English language poets", defined as the intersection of "poets" and "English language writers" to be exactly how poets would think of it. The categories are the primary characteristics. Intersections of these aren't categories. Forcing intersections onto the category structure is an unnatural artifice, required by the current software. "African-American physicists" is not a category. It's the intersection of "African-Americans" (which is itself an intersection) with "Physicists". The categories involved in this are "African", "American", and "Physicist".
Will people find it easy and natural to find the English poets by checking off the box for English people and then clicking "recompute intersection"? It seems easier to just click on "English poets".
- Easier in this one example, but if we can train people how to navigate "intersection space" they can get to so much more than anyone can possibly guide them to that I think it's worth the tradeoff. The trick is making it easy enough, and intuitive enough, for most people to understand.
-- Rick Block (talk) 04:16, 23 August 2006 (UTC)
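To make the "search by any combination of primary characteristics" idea concrete, here is a minimal Python sketch of an on-the-fly intersection. It assumes category membership is available as a mapping from category name to a set of article titles; the mapping, the function name, and the placeholder article names are all hypothetical and do not describe how the software stores categories.

```python
def intersect(category_members, selected):
    """Articles that belong to every selected primary category.

    category_members: dict mapping category name -> set of article titles.
    selected: list of category names the user ticked in the selection table.
    """
    if not selected:
        return set()
    # A category nobody has populated yet simply contributes an empty set.
    sets = [category_members.get(name, set()) for name in selected]
    return set.intersection(*sets)


# Placeholder data (hypothetical articles, not real categorization).
members = {
    "Poets": {"Article A", "Article B"},
    "English language writers": {"Article B", "Article C"},
    "Physicists": {"Article D"},
}

print(intersect(members, ["Poets", "English language writers"]))
print(intersect(members, ["Poets", "Physicists"]))
```

Nothing has to be created or named in advance here: any combination the user selects yields a result, and an empty result is still a valid page from which a different combination can be chosen.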
Sam's August 24th reply
I'm really trying to wrap my mind around what you've been talking about and here's what I'm wondering:
It seems that while most categories do not have over 200 subcategories, there are some that are much larger. The largest one I know of is Category:Albums by artist, which would be replaced by Intersection:Albums::Musical groups. This is a really huge intersection; there are many thousands of subcategories. If we are going to create an interface for navigating through intersection space, it should be able to handle a category such as this one well.
- I agree that very large categories need to be addressed, and that this is one of the largest (category:living people might be larger). This particular one I'm not so sure what to do with. It doesn't quite seem like an "index intersection". I think it may not even be an intersection at all, but a fully populated, high level primary category – the by-individual-artist subcats are generally categorized into this cat and one or more of the "genre" subcats of category:albums. If we fully populate all levels of the category:albums hierarchy I think maybe this category simply goes away. The "by genre" subcats of category:albums could be intersections of category:albums by artist (or, if we're recatting everything anyway, category:albums) with subcats of Category:Music genres, but looking at Category:Music genres it doesn't seem that there's an existing category for each genre (what's up with that?) and albums don't seem to be dual categorized in the musical genre hierarchy.
- Whatever we do with this specific category, your question is really about how a large category can be used in an index intersection. I don't know if you noticed, but I created (by hand) another mockup, see Wikipedia:Category intersection/Rick's mockup/Writers::People by nationality. More on this below.
While I can appreciate the elegance of being able to pick categories for a new intersection, I think it breaks down for these large intersections. If two categories with large numbers of subcategories intersect, how do you pick the beginning of the alphabet from one and the end from another? Is it possible to flip to the next page of one without flipping to the next page of the other?
- Yes. I've been imagining there's an equivalent of the prev/next links for each "column" (intersected category) in the intersection selection table that has more than, say, 25 subcats. My latest mockup shows one at the bottom of the nationalities column. I've added an alphabetical index (like our old friend CategoryTOC) as well, sort of just to see how it looks (and I don't like it a lot). Maybe a horizontal presentation might work out better (I'll try one). An intersection between two large categories would have two sets of prev/next links (and potentially two sets of alphabetical TOCs), one at the bottom of each column.
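The per-column prev/next idea can be sketched as follows in Python. This is only an illustration under stated assumptions: the class and method names are hypothetical, the page size of 25 is the "say, 25" figure from the reply above, and the subcategory names are generated placeholders.

```python
from dataclasses import dataclass

PAGE_SIZE = 25  # the "say, 25 subcats" threshold mentioned above


@dataclass
class Column:
    """One intersected category in the selection table, paged on its own."""
    category: str
    subcats: list
    offset: int = 0

    def page(self):
        return self.subcats[self.offset:self.offset + PAGE_SIZE]

    def has_prev(self):
        return self.offset > 0

    def has_next(self):
        return self.offset + PAGE_SIZE < len(self.subcats)

    def next_page(self):
        if self.has_next():
            self.offset += PAGE_SIZE

    def prev_page(self):
        self.offset = max(0, self.offset - PAGE_SIZE)


# Placeholder subcategory names; only the counts come from the discussion.
nationalities = Column("People by nationality", [f"Nationality {i}" for i in range(248)])
writers = Column("Writers", [f"Writer type {i}" for i in range(27)])

nationalities.next_page()        # flip one column only
print(nationalities.page()[0])   # second page of nationalities
print(writers.page()[0])         # writers column is still on its first page
```

Because each column keeps its own offset, the user can be at the end of the alphabet in one column and the beginning in the other before recomputing the intersection.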
There is a way to do this that fits with the current scheme of only presenting the first 200 hits from a database search. I mentioned something like this above, and while I too, do not find it to be very elegant, it has some advantages. We could just generate lists of the subintersections based on the subcategories of each intersection category. These subintersections are part of the intersection set. While this limits the possibility of getting subintersections of multiple subcategories, you can navigate down the hierarchy to them.
I'm also searching for a better name for this and wondering about using "Index" as a name. This would look something like this:
- I've been calling these "index intersections". I don't think they're any different from "regular" intersections, they just generally don't have any articles or categories that are in all the intersected categories. They amount to "traversal" nodes in intersection space. I really, really want these to be generally automatically generated, more on this below.
Index:Albums::Musical Groups
- Albums by artist
This index combines the following categories:
- Indexes of Albums
There are 2 indexes shown below (more may be shown on subsequent pages).
A
- Albums in the 33⅓ Series by artist
- Albums with hidden tracks by artist
- Indexes of Musical groups
There are 198 indexes shown below (more may be shown on subsequent pages).
- A
- Aconite Thrill albums
- Acoustic Alchemy albums
- Action Action albums
- Acumen Nation albums
- Adair albums
- Adam & the Ants albums
- Adam Ant albums
etc...
- Pages in index "Albums by artist"
There are 0 pages in this section of this index
Notes:
- The top section lists the categories used in the intersection and there would be one listing of indexes for each higher level of intersection. The higher levels are the intersections with one category removed. In this case there are two because it is an intersection of two categories. If there were seven categories being intersected there would be seven sections. For example in the African-American actors example there would be three sections "Indexes of African-Americans", "Indexes of American Actors", and "Indexes of Actors of African descent". You'd be able to navigate all the way up to single category indexes such as "Indexes of Actors", but no higher. To get to those higher level indexes, you would jump to the categories and find a linked index somewhere else. This would limit the extent of easy navigation, but it will also keep most novice users out of meaningless intersection space.
- I think I understand what you mean, but I think this makes the "selection" table generally huge (and not easily navigable or "intuitive"). More detailed mockups might help.
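The rule in the note above (one "higher level" section per intersected category, each formed by removing one category) can be sketched in Python. The function name is hypothetical, and returning an empty list for a single category is one reading of "no higher".

```python
from itertools import combinations


def parent_intersections(categories):
    """All intersections obtained by dropping exactly one category.

    An intersection of n categories has n parents; a single category has
    none here, matching "up to single category indexes ... but no higher".
    """
    cats = sorted(set(categories))
    if len(cats) < 2:
        return []
    return [list(combo) for combo in combinations(cats, len(cats) - 1)]


for parent in parent_intersections(
        ["Actors", "American people", "People of African descent"]):
    print("::".join(parent))
```

For the three-category African-American actors example this yields three parents, corresponding to the three "Indexes of ..." sections named in the note; seven intersected categories would yield seven.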
- Once again, I'd like each intersection page to have a name so that when they appear here they would use that name. Piping alone would not do this, because this page is automatically generated. Of course if an intersection has not yet been named, it would appear using the intersection name.
- Is allowing users to add titles by editing the "intersection" (or "index") page not sufficient?
- Indexes would need a TOC.
- Yes. I think indexes need multiple TOCs.
- Text could be added to indexes, but other than a provision for naming them, there would be nothing special about editing them.
- Absolutely!
- The only work needed to create these indexes would be naming them, and creating a link to them when appropriate. See also links to other indexes could be added to this index. This index would probably be linked to, from its parent categories. This could be done with a "see also" link, or by categorization. If we categorize indexes, we should probably have a rule that says, "Don't put an index in a category if the parent index is already in the category".
- I think there are potentially so many of these that I'd prefer if they did not have to generally be named. What is this name used for? If it's a title on the page, why is letting a user add this as "decorative" text (by editing the page) not sufficient? Alternatively, if it's for reference purposes from some other page, why is allowing a piped name for the link (from the point of reference) not sufficient?
I am not averse to your category picker, I just don't as yet see how it will work. I think this way of presenting things would work, and it would look familiar. We'd still want the category picker for articles, and the button would say something like "create an index using the selected categories".
I don't understand why you are against naming indexes (intersections).
- My issue with this is twofold. First, I'm not sure what these names are used for, (per above) is it for a title on the page or for reference purposes? Using names for these that do not match the URL or the wikilink syntax bothers me. Second, I'm imagining that there might be lots and lots and lots (100s of thousands) of these. I don't want us to have to manage these names at CFD or any other place. If the names are some sort of concatenation of the primary category names, I'm perfectly happy. If the names are "attributes" that are attached somehow and have to be managed at CFD (or by only admins) then I'm really not happy. If we don't name these, no one has to go to the trouble of creating a name, no one has to define standards for the names, no one has to object to someone naming the intersection of Category:Republican Party (United States) and Category:American politicians "retarded cretins bent on world destruction" – there just seem to be a lot of issues that disappear if we simply dispense with these names. -- Rick Block (talk) 04:32, 25 August 2006 (UTC)
I'm willing to drop the category transclusion, and just suggest creating the bots to convert to the new system. I'd still want to suggest that editors give indexes (intersections) a name when they first find one un-named, but it would not have any other implications, other than the display I've mocked up above, it would have the same restrictions to editing that page moves do. The mark-up for links could be Index:Albums::Musical groups. Off-hand, I don't see a problem with also having links using the rename such as Index:Albums by artist. If there are no double colons, the software could look for a rename match. The only problem with doing this is that the links will break if there is a rename, but that doesn't seem that terrible. After all, the links will break with the longer names if one of the primary categories gets renamed. --Samuel Wantman 10:15, 24 August 2006 (UTC)
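The link lookup described in this comment (use the category list when the link contains double colons, otherwise look for a rename match) could work roughly as in this Python sketch. The function name and the rename table are hypothetical; the table stands in for wherever the user-edited names would be stored.

```python
def resolve_index_link(target, renames):
    """Resolve 'Index:...' to a list of categories, or None if nothing matches.

    renames: dict mapping a user-given name to the categories it stands for.
    """
    prefix = "Index:"
    if not target.startswith(prefix):
        raise ValueError(f"not an index link: {target!r}")
    body = target[len(prefix):]
    if "::" in body:
        # Long form: the categories are spelled out in the link itself.
        return body.split("::")
    # Short form: no double colons, so look for a rename match.
    return renames.get(body)


# Hypothetical rename table holding the one example from the comment above.
renames = {"Albums by artist": ["Albums", "Musical groups"]}

print(resolve_index_link("Index:Albums::Musical groups", renames))
print(resolve_index_link("Index:Albums by artist", renames))
print(resolve_index_link("Index:Albums by performer", renames))  # no match
```

The last call shows the breakage mentioned in the comment: a short-form link resolves only while its name is still in the rename table, just as a long-form link depends on the primary category names staying unchanged.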
- I'm not sure you understood my last mock-up above. I'm calling ALL intersection pages "index". And they'd all generate links in English to the other index pages. The pages would all be automatically generated. Without the ability to name pages, instead of getting "Republican party politicians (U.S.)" you'd get "Republican party (United States)::American politicians". Some of these intersections might get pretty long and hard to understand. I just think it would be much friendlier to have these names in English. So in my mock-up above, all the listings are links to index pages. I'm trying to push the design towards something that will have a commonly understood meaning. The names would be edited as part of the index page, but it needs to be tied to the index page to be able to be used in other indexes automatically.
- I'm going to work on re-writing the proposal in a more general way, not specifically tied to a specific solution. I'd like it if we could both connect as many mock-ups to it to illustrate the various ways it could be done. I've tried to structure our identified alternatives below. Can you think of any other options?
- Name of new namespace:
- Intersection (Rick's preference and Sam's 1st version)
- Index (Sam's preference 2nd version)
- Markup
- Double colon :: (Rick and Sam)
- Double pipe ||
- Colon plus colon :+:
- Pipe plus pipe |+|
- Common English names for pages of the new namespace:
- Name associated with attached category (Sam 1)
- Name stored with page so that it can be displayed on other pages in the new namespace and also shown as a secondary page heading (Sam 2)
- Named only as a comment on the page (Rick)
- Navigation options
- Transcluded to categories, category structure used for navigation (Sam 1)
- Namespace navigation part of page display for new namespace (Rick and Sam 2)
- Automatically generated using common English names in lists (Sam 2)
- Parent and children categories of primary categories displayed with checkboxes (Rick)
- Vertical arrangement
- Horizontal arrangement
- Categorization of new namespace
- Not needed because pages are transcluded (Sam 1)
- Allowed (Rick and Sam 2)
- Not allowed
- Editing restrictions of new namespace
- Admins only (Sam 1)
- Same as page-moves (Sam 2)
- All editors (Rick)
- Category management
- Integrated into the software (Sam 1)
- Add-on bots (Rick and Sam 2)
Let's spell out the options, then ask for feedback. -- Samuel Wantman 06:44, 25 August 2006 (UTC)