Welcome to the Cheesiest Blog on the Web

Any comments or questions regarding the content on this blog can be addressed to timothyjsharpe@yahoo.com

Tuesday, October 11, 2011

Reanalyzing Gen 5's Uber Criteria – Problems with Smogon's Approach


First of all, this is a very specialized post regarding a very specific topic: the online competitive Pokemon metagame. Those who are not familiar with competitive Pokemon may not know the jargon used in this article, or care much about the topic. However, those of you who, like me, play competitive Pokemon may find this article and my thoughts interesting.

Anyway, Smogon's recent Standard bans of Excadrill and Thundurus have gotten me thinking about Smogon's current approach to building the ideal metagame.

The “why” of Competitive rules

Most of you who know me know that I have always been a Smogon supporter. Some may argue that Smogon tiers and rules change the game so that it is played “not as intended”, but ultimately it is the player's choice how to play the game. The majority of online gamers recognize potential abuses that exist within Pokemon, and therefore see the reasoning behind extra rules designed to enhance competitive play. The ultimate goal, of course, is to create a well-balanced metagame, which ideally leads to one that is fun and interesting to play.

It is important to define what “well-balanced” means. To understand this, one must understand the logic of competitive players and environments. Generally, the difference between competitive players and casual players is that competitive players tend to find their “fun” in winning, not the gameplay itself. It is these players who typically wish to test their skills in a competitive environment. For casual players, no competitive rules need exist at all. Competitive players, however, will strive to create strategies that generate the greatest chance to win. For that reason, a competitive metagame will always centralize around the best strategies, because in general, players will choose the strategies that give them the best chance to win. Because of this, competitive metagames face a very real threat of becoming over-centralized.

To illustrate using an analogy, let's take the game “Rock Paper Scissors”. For those who do not know, it is a simple game in which both players select one of those three objects and reveal them at the same time; Rock beats Scissors, Paper beats Rock, Scissors beats Paper. Let's assume a new rule is added to the game: a player can choose “Gun”, which defeats all three. Naturally, any competitive player will always choose Gun, since it provides the best chance to win. Seeing this, the creators of “Rock Paper Scissors” add a fifth option, the “Bulletproof Vest”, which beats Gun but loses to the other three. Now the metagame will centralize around the broken agent, the cause of the game imbalance (in this case, Gun), and its counter (in this case, the Vest). This new game deceptively has “five options”, but really has only three: Gun, Vest, or R/P/S. In this case, the differences between R/P/S are irrelevant, as their only use is to counter the Vest. We have effectively gone from a game where each option has equal value to a game in which only 60% of the options are legitimate. This simple example shows how metagames will naturally centralize around a broken agent and its counter(s).
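For anyone who likes to see the analogy spelled out, here is a minimal sketch in Python that tallies each option's wins and losses. The names `BEATS`, `record`, `FOUR`, and `FIVE` are my own, made up for illustration; the matchups are just the ones described above.

```python
# Matchup table for the extended game: each option maps to the set of options it beats.
BEATS = {
    "Rock": {"Scissors", "Vest"},
    "Paper": {"Rock", "Vest"},
    "Scissors": {"Paper", "Vest"},
    "Gun": {"Rock", "Paper", "Scissors"},
    "Vest": {"Gun"},
}


def record(option, pool):
    """Return (wins, losses) for `option` against every option in `pool`."""
    wins = sum(1 for other in pool if other in BEATS[option])
    losses = sum(1 for other in pool if option in BEATS[other])
    return wins, losses


FOUR = ["Rock", "Paper", "Scissors", "Gun"]  # the game after Gun is added
FIVE = FOUR + ["Vest"]                       # the game after the Vest is added

for name, pool in (("four options", FOUR), ("five options", FIVE)):
    print(name)
    for option in pool:
        wins, losses = record(option, pool)
        print(f"  {option}: {wins} wins, {losses} losses")
```

With only four options, Gun has three winning matchups and no losing ones, so nothing else is worth picking. Once the Vest exists, Gun has exactly one losing matchup, and the Vest's only winning matchup is Gun itself.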

By banning Mewtwo, other Pokemon become viable.
Therefore, to be “well-balanced”, competitive rules are created with the intention of producing a game with the greatest number of viable strategies, or the greatest variety. The uber list of Pokemon exists to reach this end. Each Pokemon on the uber list outclasses, or makes irrelevant, several other Pokemon. For example, why use Dragonite, Salamence, or Altaria when one could simply use Rayquaza, who can do more or less everything those Pokemon can do, only better? Or why use any wall when opponents commonly use Deoxys-Attack, who can effectively break almost every wall in the game? By banning an uber, one hopes that several other Pokemon become competitively viable, that is, able to be played with the primary intention of winning.

The other goal of competitive rules is that they should usually be simple, easy to understand, and try to be true to the spirit of the game as much as possible. For example, one could say, “Mewtwo wouldn't be a problem if it didn't have its awesome movepool, so why not ban each really good move, but only on Mewtwo?” While this may allow Mewtwo to be played in the metagame, having three pages of rules like “Mewtwo can be used, but can't have Move X, Y, Z, or Item A, B, or C …, etc.” would be very confusing and in the long run, turn away players due to the fact that it ruins the 'spirit' of Pokemon, and complicates the game too much.

Smogon's classic approaches

Smogon creates the most popular set of competitive rules.
Now that I have explained the goal of competitive metagames, let's take a look at how Smogon currently approaches creating competitive Pokemon rules. The first are the common clauses (most notably Sleep Clause), which have been around for a very long time. I won't go much into Sleep Clause, but note that without it, Sleep teams would be a broken agent, and the metagame would centralize around Sleep abuse teams and Sleep counter teams. Obviously, Sleep Clause may eliminate these two team archetypes, but it is worth it, since several other viable strategies emerge that could not compete with Sleep abuse.

As most of you know, Smogon uses a tiering system to build its competitive metagame. Basically, Pokemon are separated into tiers, and players may choose to play in a particular tier, in which all Pokemon of that tier and Pokemon in a lower tier are legal. Unlike many other tiering systems, Smogon uses usage statistics (taken from the most commonly used Pokemon simulator at the time; in Gen 5, this is Pokemon Online) to derive most tiers. Hence, the tiers are named accordingly: Overused contains the most used Pokemon, Underused contains Pokemon that do not reach a certain usage, then comes Rarelyused, and finally Neverused (there is also an Unevolved tier for all unevolved Pokemon that have little to no reason to see competitive play over their evolutions). The exception to this system, however, is the Uber tier, which is not based on usage.
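The “that tier and any lower tier” legality rule can be sketched in a few lines of Python. This is only an illustration of the rule as I described it, not Smogon's actual implementation; `TIER_ORDER` and `is_legal` are hypothetical names of my own, and I have left the Unevolved tier out for simplicity.

```python
# Usage-based tiers from highest to lowest, as described above.
TIER_ORDER = ["Overused", "Underused", "Rarelyused", "Neverused"]


def is_legal(pokemon_tier, format_tier):
    """Return True if a Pokemon assigned to `pokemon_tier` may be used in `format_tier`.

    A Pokemon is legal in its own tier and in every tier above it.
    Uber is treated as a ban list, so Ubers are never legal in a usage-based tier.
    """
    if pokemon_tier == "Uber":
        return False
    return TIER_ORDER.index(pokemon_tier) >= TIER_ORDER.index(format_tier)


print(is_legal("Underused", "Overused"))   # an Underused Pokemon in an Overused match
print(is_legal("Overused", "Underused"))   # an Overused Pokemon in an Underused match
print(is_legal("Uber", "Overused"))        # an Uber in an Overused match
```

The first check comes out legal and the other two do not, which is the whole tiering rule in miniature: Pokemon can always be played “up”, never “down”, and Ubers sit outside the ladder entirely.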

The Uber tier is not intended to be its own competitive metagame, but rather a list of Pokemon banned from play in Overused (commonly called “Standard”, the most commonly played tier). Since broken agents are not always evident in usage statistics (and since some broken agents tend to hide others), the Uber tier is not based on usage. Instead, it is an opinion-based tier, with a complex testing system in place and carefully worded criteria about what should be considered Uber. There are two considerations here, however. One is the obvious, determining what the broken agents are, and two, determining what the broken agents are not. Banning Pokemon that could exist with very little consequence in Standard is a negative thing, since by doing so you are reducing the number of viable options. What makes this tricky is a broken agent's influence on other Pokemon in the metagame. For example, if Kyogre were legal, Swift Swim Kingdra would have a very high usage. Does that mean Kingdra is Uber? No, since it would not be a problem if Kyogre were banned. But this illustrates the problem: sometimes a Pokemon is banned for being a suspect when, in fact, another agent altogether was the true problem.

Garchomp was found to be Uber in Gen 4.
To compensate for this, Smogon determines Ubers by carefully examining each suspect, which is, quite simply, a Pokemon it suspects of being a broken agent. Smogon then runs a number of testing metagames on Pokemon simulators to see how each suspect affects the metagame alone and in combination with other suspects. Each of these testing cycles can take months, which is why Smogon rules are in a constant state of flux and decisions are made very slowly. At the end of it all, players ranking highly on the Pokemon simulator ladder vote on whether a suspect is Uber or not Uber. It's a good system by design. Public opinion is often skewed by personal experience. But with the vast and detailed data available from usage statistics and the vast experience and skill level of the voters, the decisions made by this brain trust are typically very good.

Smogon's definition of Uber has always been that an Uber, a broken agent, must be a Pokemon. This is for simplicity (see the Mewtwo example earlier). This is within the spirit of Pokemon, since each Pokemon species seems to be its own entity. Its base stats, movepool, and typing are all innate qualities of the species, so it would make no sense to make rules allowing a Pokemon but disallowing certain options normally available to it. This system worked very well for Smogon from Gen 1 through Gen 4. The only notable exception I can think of off the top of my head is the ban of “Soul Dew”, which allowed Latias to exist in Overused somewhat-abridged in Gen 4 (but one could always consider the Soul Dew rule a clause).

What went wrong with Generation 5

Speed Boost was previously balanced by being on poor offensive Pokemon.
Smogon's current Generation 5 bans seem erratic right off the bat, and seem to ignore the fundamental principles highlighted above. Uber is typically a tier reserved mainly for “top-tier legends” (the Pokemon with the highest base stats), but several Pokemon that are not “top-tier legends” have already been banned, including Blaziken, Thundurus, Garchomp, and Excadrill. In addition, Smogon has already made decisions that go against its old philosophy of simplicity. Moody was banned despite being an ability (not a Pokemon), and a complex rule exists stating that one cannot use Swift Swim Pokemon together with Drizzle Pokemon. This creates a double standard, especially when you consider Excadrill, since the sole reason it appears to have been banned was its ability, Sand Rush, which is merely the Sandstorm version of Swift Swim. So why are Swift Swim Pokemon not considered Uber while Excadrill is (or why is there not simply a rule disallowing Sand Rush with Sand Stream)?

The main problem for Smogon, as I see it, is not their system. The system they have in place is excellent and, by all accounts, should not vary in effectiveness based on the metagame it is evaluating. Instead, the problem is their outdated definition of Uber. There was one huge change in Generation 5 that requires us to re-evaluate how we define an Uber, and that change is Dream World. Dream World is a feature through which Pokemon can receive new abilities, abilities they did not have before. The reason this is a problem is that a number of Pokemon received abilities that drastically altered their identities.

Time for another history lesson, though. Pokemon Abilities were a feature introduced in Generation 3. Basically, these are passive traits of a Pokemon designed to make each species more unique, beyond stats, movepool, and typing. Most of these abilities were minor perks, but a few were competitively significant and played a huge role in the identity of the Pokemon (for example, Wonder Guard is crucial to the identity of Shedinja; it directly influences what it can do and what it is useful for). For the most part, in Generations 3 and 4, powerful and unique abilities were balanced by being on Pokemon that would be pretty useless otherwise, so the ability and the Species' identity were one and the same.

In Generation 5, however, these once-exclusive abilities can now be found on many Pokemon that by all accounts did not need them, causing a number of problems. The most notable of these abilities are Speed Boost, given to Blaziken, and Drizzle/Drought, given to Politoed and Ninetales, respectively. Speed Boost was once balanced by the fact that it was an “identity ability” of a very select few Pokemon, namely Ninjask and Yanmega, who lacked offensive firepower. Blaziken, however, especially with the improved movepool given to him by Generation 5 and his great competitive typing, is an offensive powerhouse whose one check previously had been Speed. In the case of Drought and Drizzle, these abilities were previously given only to Ubers, so we never really had to consider their competitive implications. Once again, “Drizzle” and “Kyogre” were one and the same thing. Now that Drizzle is also a quality of Politoed, a non-Uber, we have to re-evaluate some things.

Solution... No Longer is Species the only thing that can be 'Uber'

The first solution is obvious, to me. It's not banning all Dream World abilities. Dream World, all things considered, is a good mechanic, and most Dream World abilities provide interesting new viable strategies to older Pokemon without making them suspect, so we wouldn't want to reduce the number of viable options. Instead, I've concluded that, in Generation 5, Abilities can and should be eligible to be Uber in addition to Species. With most Pokemon now having three Abilities, they have ceased in many cases to be part of a Pokemon's identity, since Abilities now belong in most cases to a wide array of different Species. The same conclusion could be reached in the case of a super powerful move given to most Pokemon: the obvious answer is to ban the move, not all the Pokemon who have it (Stealth Rock is one such move that was often debated even in Generation 4 as possibly being a broken agent). This solution really shouldn't even sound farfetched, as Smogon has already set the precedent by banning Moody early in the Generation 5 metagame.

The question becomes, however, if Abilities can be Uber, how can we determine when an Ability is Uber compared to a Species? As I stated before, in the case of most powerful abilities, the reason they do not cause problems is due to the Pokemon they are attached to. When a Pokemon becomes very powerful due to a new ability, however, we have to consider that the Ability is suspect, not the Species. The obvious example here is Blaziken, who received Speed Boost. Logically, we must ask ourselves: “Which is the broken agent, the species Blaziken or the ability Speed Boost?” I maintain that with simple logic, the conclusion is obvious. If you take the ability Speed Boost and place it on most common OU Pokemon, it would result in that Pokemon becoming too powerful. On the other hand, giving Blaziken most other common abilities would not likely cause it to become too powerful. This could be confirmed through testing, by seeing if Blaziken with Blaze as opposed to Speed Boost is Uber. I think the result is obvious, Blaziken would suffer from its Generation 4 weakness, its lack of Speed.

Permaweather is an uber mechanic.
As for Drizzle and Drought, the same kind of argument could be used. Obviously, Politoed and Ninetales are not suspect without Drizzle and Drought; in fact, they were both Neverused in Generation 4. Not much else changed about them. Smogon's current solution to Permaweather (Weather-producing abilities) is laughable: restricting Swift Swim with Drizzle, and banning Excadrill. Sure, one could argue that Swift Swim, Sand Rush, and Chlorophyll (which will now be referred to as the “Weather Speed Boosting abilities”) are what is broken here, and you could be right. Banning these abilities would likely solve the problem of Generation 5 being too weather-based just as well as banning Permaweather.

However, there are several factors that lead me to believe Permaweather is the broken agent here. First of all, weather teams would not be useless if Permaweather were banned. Permaweather clearly outclasses the alternative, though, namely the moves Sandstorm, Sunny Day, and Rain Dance, for two reasons: one, the moves require a turn of setup while Permaweather does not, and two, they have a limited duration (5 turns, or 8 if you use a hold item, which sacrifices your ability to use another item). Weather Speed Boosting abilities are completely balanced in the limited-duration context, since the extreme power they grant is only temporary. And while other broken weather abilities don't really exist yet, it is conceivable that more will be added in future installments. It becomes a choice of whether you'd rather ban several Abilities, or only the Permaweather abilities. It's obvious to me that Permaweather is the broken mechanic here.
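To put rough numbers on the duration argument, here is a small Python sketch comparing how many turns of a battle are spent under weather from a single setup. The function name `weather_turns` and the 20-turn battle length are assumptions of mine, chosen purely for illustration; only the 5-turn and 8-turn durations come from the paragraph above. It also ignores the extra turn the move user spends setting up, which only widens the gap.

```python
def weather_turns(battle_turns, duration=None):
    """Turns of active weather produced by a single setup.

    `duration=None` models Permaweather, which lasts for the rest of the battle.
    Otherwise the weather lasts `duration` turns or until the battle ends.
    """
    if duration is None:
        return battle_turns
    return min(duration, battle_turns)


BATTLE_TURNS = 20  # hypothetical battle length, not a measured figure

for label, duration in [
    ("weather move", 5),
    ("weather move + hold item", 8),
    ("Permaweather ability", None),
]:
    active = weather_turns(BATTLE_TURNS, duration)
    print(f"{label}: {active} of {BATTLE_TURNS} turns under weather")
```

Whatever battle length you plug in, the move-based options cap out at 5 or 8 turns, while the Permaweather line always equals the full length of the battle.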

Also, since Permaweather abilities clearly outclass the alternative method of setting up weather, and since so few Pokemon have access to Permaweather abilities, it is clear that Permaweather causes more centralization than the Weather Speed Boosting abilities. Because of this, Pokemon that have Permaweather abilities are becoming widely used, while most people are forced to bring counters to each of them if they'd like to compete. It also centralizes weather teams themselves, since being forced to use a single Pokemon limits your ability to use other Pokemon of that type due to type coverage (for example, being forced to use Tyranitar or Hippowdon in Sandstorm restricts you from using other Water- or Fighting-weak Pokemon), causing further centralization. There are many more Pokemon with Weather Speed Boosting abilities than with Permaweather abilities, so by getting rid of Permaweather abilities you both create a simpler rule and make more strategies viable by reducing centralization. Of course, everything I said in theory above could be verified with Smogon's testing methods. Recall that Sand Stream was a problem in Generation 4 too. Now, ironically thanks to Dream World, all Pokemon with Permaweather abilities have a different ability available to them, so we aren't banning the Species by banning the Ability.

With the issues I highlighted above, we have fixed a number of weird obscurities with Smogon's current rules. By banning Permaweather and Speed Boost, we could now test Blaziken, Garchomp, Excadrill and possibly even Thundurus back in Overused. Also, we can get rid of the complicated rule disallowing Swift Swim with Drizzle. At the same time, we have reduced the metagame's centralization towards weather teams, likely increasing the amount of viable options in Standard, allowing Pokemon who could not compete with the Weather bias to shine.

Final thoughts, possible problems with my above solution

Some Pokemon, like Yanmega, may suffer as a consequence of this change.
There is one final debate, however, this one mainly having to do with Speed Boost at the moment. If an Ability is deemed to be Uber, is it the Ability or the Ability-Species combination that is Uber? This is important since if Speed Boost becomes Uber, Yanmega and Ninjask are all but useless (Yes, they got new abilities from Dream World, but they aren't nearly as good). While it would be troublesome to set a precedent that says Ability/Species combinations can be Uber since that seems awfully complicated, with Blaziken being the only offender at this point, we could make him the exception to the rule by simply banning Speed Boost on Blaziken. On the other hand, if Speed Boost was given to other Pokemon in the future and it became a problem, we'd have to overwhelmingly decide to simply ban it. So in the long run, it may be simply better to say “Sorry, Yanmega and Ninjask, but Speed Boost is Uber.” After all, if every Pokemon had access to Wonder Guard, we'd obviously have to ban it too, despite its impact on Shedinja.

11 comments:

  1. Permaweather and the Weather Speed Boosting abilities are the heart of OU, which is the core of the metagame. It does seem much like a broken mechanic and centers pokemon around it. Tyranitar, Politoed, Ninetails, (though hail is uncommon ->)Abomasnow, Snover, Hippowdon and Hippopotas all have other abilities, so banning perma-weather is completely realistic. Weather Speed Boosting Abilities will still have the environment to exist due to normal weather moves.

  2. T-tar, Hippo and Abomasnow do not have released other abilities though, as you can not get them in DW yet. Until they are released, banning sand stream is akin to sending hippopotas to ubers, as well as his big brother and T-tar.
    If you allow the use of these DW abilities before release, why not use Shadow Tag Chandelure or TechniLoom as well?

  3. http://www.smogon.com/bw/pokemon/hippowdon
    It says sand force AND talks about it. Smogon clearly doesn't care if all dw abilities aren't released.

  4. Azure Flute never was

  5. That's an item, and they released the Pokemon in a different way. One way or another, I see little reason to believe that all Dream world abilities won't eventually be released.

    And even if Tyranitar and Hippowdon needed to bite the Uber bullet, I still think it would be a positive thing for the meta. By removing two Pokemon, several other Pokemon become viable.

  6. Plus, as DoubleRose stated, Smogon currently has no rule against using unreleased abilities, it seems.

  7. I have a PO account that uses a team of Pokes with Dream World abilities, albeit not all of them, and it's even mono-type too! If perm-weather abilities were banned altogether, my Pokes could easily pull off more wins, because I don't have to worry about those perm-weather threats, especially the mole in the sandstorm. And my Pokes don't use those broken abilities anyway. Volt Absorb and the like are OK.

  8. I completely agree with basically everything being said. It is the permaweather which is completely broken and now almost every team you face abuses weather, and if you dont take a weather team of your own your almost certain to get smashed. If say excadrill didn't have permanent sandstorm behind it, it would be less of a threat and people wouldnt have to bring a check to avoid getting swept.

  9. Wow, this is a really great analysis of the current metagame, I am surprised I haven't realized this sooner.
    I believe that Smogon should attempt to do suspect testing with permaweather banned, and see how the metagame is impacted. I believe a lot of things would become more viable as they were in 4th gen (the good ol' days lolz)
    I also be very happy to see my leftovers actually restoring the HP of my pokes as well xD
    Great article TKN, keep it up!

  10. Permaweather is so simple to handle... as there is Sleep/Freeze/Item/Pokemon clauses, why don't they create a Weather clause? So it becomes clearly optional to fight a weather team.

    But it must be restricted, every server on Pokemon Online, press that "Find Battle" button... you have ~85% chance of fighting a weather team. This clearly makes a ton of pokemon useless because "they are weak in or against Rain/Sun/Sand".
