Steve Yegge recently published a manifesto that claims that design patterns and refactoring only bloat Java code to the point of being unmaintainable. He cited his 500k LoC Java game as an example.
He has since decided to rewrite his game in Rhino (an implementation of ECMAScript that runs on the JVM), arguing that the dynamic nature of ECMAScript keeps the code base smaller than a rigorously statically typed language such as Java can.
I would never argue that Java is not a verbose language full of boilerplate code; I work with .NET, which suffers from the same phenomenon.
However, I will argue that Yegge's reasoning is flawed when he equates total lines of code with both maintainability and complexity.
Comparing snippets of Java or C# to snippets of Ruby/Perl/Python etc. is deceiving. Sure, a dynamic language may accomplish in one line what takes a static language five, but the purpose of the static code is instantly apparent, while the dynamic code works by using behind-the-scenes framework magic.
Besides, if you find yourself repeating those same five lines over and over, then it's time to apply the DRY principle and refactor them into a method. Now your static language can do exactly what the dynamic language can, and as an added advantage, you can always grok the method to see exactly what is going on. From a maintenance standpoint, this is a huge advantage.
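As a minimal sketch of that extraction (the class and method names are hypothetical and not taken from Yegge's game or any real code base), a handful of validation and formatting lines that might otherwise be copy-pasted at every call site can live in one named method:

```java
import java.util.Locale;

// Hypothetical example: PriceLabels and formatPrice are illustrative names.
class PriceLabels {

    // The repeated lines now live in one place, where a maintainer
    // can read them and change them once.
    static String formatPrice(String name, double amount) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        if (amount < 0) {
            throw new IllegalArgumentException("amount must not be negative");
        }
        // Locale.ROOT keeps the decimal separator independent of the machine's locale.
        return String.format(Locale.ROOT, "%s: %.2f", name.trim(), amount);
    }

    public static void main(String[] args) {
        // Each call site is now a single line.
        System.out.println(formatPrice("Sword", 12.5));
        System.out.println(formatPrice(" Shield ", 30));
    }
}
```

Each call site shrinks to one line, and the method body is ordinary code that can be opened and read rather than framework behavior that has to be looked up.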
Dynamic languages are not a magic bullet that instantly solves maintenance woes; in fact, I believe they are the complete opposite. One only has to look back 5 years to classic ASP and dynamic VBScript to see the complete frustration that dynamic development causes on large projects.
A properly designed application using a static language such as Java or C# might easily hit the 500k LoC mark, but with proper design and abstraction, the actual code base can easily be split into manageable 10k LoC assemblies.
At that point you are only maintaining small libraries that reference each other, instead of a super massive black hole of spaghetti code.
I believe that both forms of languages have their purpose, but I do not buy into the recent dynamic language hype. I have been developing in various languages long enough to remember previous dynamic language hype, and the resulting messy applications that came from it.
A final note that I do not believe Yegge touched on (mostly because I didn't read all of his 400-page rant) is that dynamic languages must be developed using test-driven methodologies. Applying TDD to static languages is great as well, but the compiler adds an additional level of checks.
What this means in the real world is that a properly designed application in a dynamic language will need unit tests to ensure that all runtime functionality is carefully checked. The end result is thousands of lines of test code that must be written to replicate the functionality of a compiler's type checker*.
* I still advocate TDD when using statically typed languages; you just don't need to do as much.
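To make the trade-off concrete, here is a rough sketch in Java. It only imitates dynamic typing by declaring a parameter as Object, and the class and method names are made up for illustration:

```java
// Hypothetical sketch: TypeCheckSketch and its methods are illustrative names.
class TypeCheckSketch {

    // "Dynamic style": the parameter is Object, so nothing stops a caller
    // from passing a String. The mistake surfaces only when the line runs.
    static int doubleLoosely(Object value) {
        return ((Integer) value) * 2; // ClassCastException if value is not an Integer
    }

    // Statically typed: doubleStrictly("21") does not compile, so no test
    // is needed to rule that call out.
    static int doubleStrictly(int value) {
        return value * 2;
    }

    public static void main(String[] args) {
        System.out.println(doubleStrictly(21));
        System.out.println(doubleLoosely(21));
        try {
            doubleLoosely("21");
        } catch (ClassCastException e) {
            System.out.println("wrong argument type found only at runtime");
        }
    }
}
```

For the loosely typed method, a unit test has to exercise the bad call to find the problem; for the typed method, the compiler rejects it before any test runs. Tests of the actual doubling logic are needed either way.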
Thursday, December 27, 2007
32 comments:
I thought that it was common knowledge that code in dynamic languages was less maintainable than code in statically typed languages. I believe there are even research papers that show this (the one I know of used Ruby as a use case). Dynamic languages should not be used for any kind of application that is meant to last, i.e. not expected to be replaced and rewritten any time soon (but still maintained), and not for projects with multiple programmers changing over time.
Hasn't this dead horse been beaten enough?
To larsivi: what kind of ridiculous statement is that? Why would dynamic languages be any worse for maintenance or for projects with multiple programmers over time?
As someone with experience with both static (Java and C++) and dynamic (Ruby) languages, I have not seen any evidence that either is more or less maintainable. With modern tools, unit tests, common conventions and idioms, and good programming practices, code written in dynamic languages is maintainable.
The fact is that dynamic languages like Ruby provide language features that make many of the GoF design patterns obsolete. Features like meta-programming, open classes, and duck typing, for example. You'll still refactor occasionally, but it's usually easier.
"You'll still refactor occasionally, but it's usually easier."
Dude, you've drunk way too much of the dyna-lang Kool-Aid. Refactoring Java is dirt simple; every IDE out there has a gazillion refactorings. For the most part, other dyna-langs have lousy tools, and refactoring them is so painful it makes my eyes bleed.
A programmer laments that he has 500K LOC of Java that he doesn't know what to do with. So he decides to switch to a language that has mostly been tested on web pages and small short programs.
An engineer rewrites that 500K LOC of Java into 150K LOC of Java. He keeps his known runtime and can rely on the fact that his language and runtime can adequately run much larger programs.
One way of looking at the story says that rewriting a bloated Java codebase is not sexy.
But switching to a scripting language *is*.
But the real moral of the story is learning to gently nod your head and smile when you see someone who writes an overly verbose "manifesto" to lament an overly bloated codebase.
Maybe one day this person will be an engineer. But, for today, they are just a programmer.
You seem to be assuming that dynamic languages are automatically more concise than statically typed languages, but this isn't necessarily the case.
Consider languages like ML, Scala, and Haskell: by using type inference, they achieve much of the conciseness of a dynamically typed language, with the compile-time type checking benefits of a statically typed language.
To echo "sanity"'s comment above, I hate how the comparisons are static vs. dynamic. Coming from the perspective of someone who has written a lot of Java and Python and Perl, the biggest difference is not type safety. The biggest difference is language/compiler features.
In my opinion, Java is a terrible language (in 2007) not because it is statically typed but because the language/compiler does not support the creation of robust, easy to use libraries, or conveniences like type inference. Scala is a great example of a language that is designed with a lot more productivity in mind, while still being statically typed. (And, in fact, still being JVM-based... so you get compatibility with your old Java code).
ECMAScript 4 adds optional static typing to an already expressive language (plus proper modules and whatnot to support programming in the large). Using Rhino with those features will certainly make it an appealing alternative to Java on the JVM.
For the record, C# 3.0 has type inference and a host of other "dynamicish" features.
The end result has been more irritating than good at least in my opinion, and will be the subject of a post someday.
Jonathan, I'd love to hear why type inference has been problematic for you. I'm not that familiar with C#'s implementation, although the fact that it was grafted onto the language in version 3.0 concerns me. It's really the type of thing that needs to be in the language from day one for it to be done elegantly.
From some brief research, it seems that C# has rather limited type inference. It only works with local variables; you can't use it for method parameters and elsewhere.
Adding features to pre-existing languages often leads to a mess. Witness the awfulness that is generics in Java, or the huge debate over the various Java closures proposals.
If you haven't played with a language that was designed with type inference from the ground up, you really need to.
IMHO Scala is the most exciting as it compiles to run on the JVM (it's almost as fast as Java), and can benefit from the massive number of Java libraries already out there. On the down side, its standard library is still a bit buggy, and IDE plugins aren't really there yet (they exist, but they are also pretty buggy).
OCaml is also popular, as is Haskell, but Scala is where I'd put my money for wide adoption.
"Sure the dynamic languages may accomplish in one line what a static language takes 5 lines, but the static language is instantly apparent on its purpose, while the dynamic language works by using behind the scenes framework magic."
Static languages have frameworks, and so "behind the scenes framework magic" is not solely the province of dynamic languages. Moreover, one line of near-human-language code that leverages a framework beats five lines of general-purpose code, which in turn clobbers one line of Perl-esque cat-walked-across-the-keyboard syntax for maintenance. None of that is tied to static vs. dynamic either, as there are some badass lines of Java and C++ out there too.
"One only has to look back 5 years to classic ASP and Dynamic VBScript to see the complete frustration that dynamic development causes on large projects."
I was under the impression that the problem there was poor separation of concerns, such as blurred lines between models, views, and controllers. Again, this is not a static vs. dynamic language thing, as you can screw up separation of concerns in any language you like.
"The end result is thousands of lines of test code that must be written to replicate the functionality of a compilers type checker"
This compares well to the thousands of extra lines of test code that must be written for testing statically typed languages, because of code duplication to support the same operations on multiple types. For example, in Java, if you are doing date processing and want to support java.util.Date, java.util.Calendar, java.sql.Date, and objects from a third-party date-handling JAR as input to your API/object model, you'll either need four separate methods (and 4x the tests), or you'll need one method that takes an untyped object pointer (and still have 4x the tests).
I am not saying that statically-typed languages are bad and dynamically-typed languages are good. However, if you expect anybody to give your thoughts on the subject much credence, you'll need to do a better job.
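For the date-handling case described in this comment, one common way to keep the duplication small is to write thin overloads that convert each input type to a single neutral representation and put the real logic in one method. The sketch below is only an illustration under that assumption; DateApi and daysSinceEpoch are hypothetical names, and the third-party date type is left out:

```java
import java.util.Calendar;
import java.util.Date;

// Hypothetical API: DateApi and daysSinceEpoch are illustrative names.
class DateApi {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // All real logic lives here, on one neutral representation (UTC epoch millis).
    static long daysSinceEpoch(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_DAY);
    }

    // Thin adapter. It also accepts java.sql.Date, which is a subclass
    // of java.util.Date.
    static long daysSinceEpoch(Date date) {
        return daysSinceEpoch(date.getTime());
    }

    // Thin adapter for Calendar.
    static long daysSinceEpoch(Calendar calendar) {
        return daysSinceEpoch(calendar.getTimeInMillis());
    }
}
```

Each adapter still deserves a test of its own, so the commenter's point about extra test code stands, but the date arithmetic itself only has to be tested once.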
Scala has been popping up on these dyna vs. static cage matches lately. Scala is probably the coolest language that no one really groks. ;-) Has anyone really tried to do a project with it? I've waded through the (very academic) docs and given it a shot, but it's still a slog for me to get work done with it. The integration with Java is very good, except for some annoyingly verbose adapter contortions (such as working with Java collections in Scala). Be prepared to wade through the Scala source code to figure out how to do things. Hopefully, this will change with the upcoming Scala book release.
BTW, Groovy is a very easy dyna-lang for Java coders to pick up (it has both static and dynamic typing).
Honestly, like a lot of folks, I write the majority of my code in Java and despite how verbose some folks tell me it is, somehow I still manage to get my work done.
"Dude, you've drunk way too much of the dyna-lang Kool-Aid. Refactoring Java is dirt simple; every IDE out there has a gazillion refactorings. For the most part, other dyna-langs have lousy tools, and refactoring them is so painful it makes my eyes bleed."
You have a point, but not all refactorings in Java are dirt simple and toolable. And not all refactorings are so complicated that a tool is that helpful. The most common refactorings are probably Rename (Class/Method/Variable) and Extract Method -- you can do these pretty easily without a tool in any language.
Refactoring manually isn't all that hard, especially in a dynamic language, when you have an automated test suite and take small steps (which you would with a static language too, I hope).
"...not all refactorings in Java are dirt simple and toolable."
OK....would you give an example?
"The most common refactorings are probably Rename (Class/Method/Variable) and Extract Method -- you can do these pretty easily without a tool in any language."
That's simply untrue. It's easy to do a search and replace; except that falls apart if you have methods with the same name, overridden methods, or a relatively large code base. And really, who has a test suite that's going to catch every error you introduce in a refactoring?
One complicated refactoring that comes to mind is Replace Conditional with Polymorphism. I don't know of a tool that can do this in one step but it's a pretty common refactoring.
In real life it's usually not that hard to rename a method or a class in a dynamic language. I do it all the time. Name collisions aren't all that common if you use semi-meaningful names, and any collisions are usually pretty obvious at a glance. Sure, it's easier with a tool in a statically-typed language, but it's a pretty trivial manual process.
Extracting a method, which I've found to be extremely common, is a no-brainer in any language, and can be done by a tool.
More complicated refactorings that are assisted by tools in static languages often aren't as complicated or simply aren't necessary in dynamic languages *because* of dynamic typing, not in spite of it. I base this on real world experience. It's counter-intuitive when you're used to the safety net of a compiler and static typing, but good developers working in a good code base just don't have trouble with it.
Jeremy, the Replace Conditional with Polymorphism refactoring is a particularly bad example for comparing refactoring between a statically typed and dynamically typed language.
That particular refactoring relies on if/case statements that check type, so it does not apply at all to dynamic languages.
@jonathan: dynamic typing doesn't mean that objects don't have types. Ruby is a strongly typed language. Replace Conditional with Polymorphism definitely applies.
Besides, not all conditionals check an object's actual type -- sometimes they check a field that implies a type or sub-type.
I found this article also by Steve Yegge that pretty much sums up how I feel about refactoring, automation, and Ruby. "Ruby is a butterfly."
Oops, forgot the link...
@Jeremy:
Not all dynamic languages are strongly typed. I have not worked in Ruby enough to realize that it is.
However, many dynamic languages are weakly typed, so you wouldn't be able to refactor this way.
Perl:
"2" + 4 = 6
Javascript:
"2" + 4 = "24"
etc.
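For comparison, and only as a side note to the Perl and JavaScript lines above, the same expression in Java behaves like the JavaScript case: the + operator with a String operand is string concatenation, and an explicit conversion is needed to get arithmetic. The class name below is a made-up illustration:

```java
// Hypothetical sketch: CoercionSketch is an illustrative name.
class CoercionSketch {

    // With a String operand, + means concatenation in Java.
    static String concat() {
        return "2" + 4;
    }

    // Arithmetic needs an explicit conversion from String to int.
    static int add() {
        return Integer.parseInt("2") + 4;
    }

    public static void main(String[] args) {
        System.out.println(concat()); // prints 24
        System.out.println(add());    // prints 6
    }
}
```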
Regardless of whether the language is weakly or strongly typed, Replace Conditional with Polymorphism can still be a useful refactoring because the conditional doesn't always care about the object's class (type), but the value of some attribute.
Jeremy: Your own link disagrees with you.
The entire concept is to subclass based upon the type of the object. This refactoring has nothing to do with other attributes of the object.
Jonathan, I'm sorry but you're absolutely wrong. Take a closer look at the example on the Refactoring web site. "_type" is an instance variable, not the object's class. In this case the object's "type" is not its class.
I have seen examples in Java where the conditional does check the class using getClass() or instanceof, but there are other conditionals that can be replaced with polymorphism.
Jeremy,
The essence of polymorphism is that it allows you to avoid writing an explicit conditional when you have objects whose behavior varies depending on their types.
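class Employee {
    // Assumed members: the excerpt uses _type and monthlySalary() without showing them.
    private final EmployeeType _type;
    private final double _monthlySalary;

    Employee(EmployeeType type, double monthlySalary) {
        _type = type;
        _monthlySalary = monthlySalary;
    }

    public double monthlySalary() {
        return _monthlySalary;
    }

    // Delegates to the type object instead of branching on a type code.
    public double payAmount() {
        return _type.payAmount(this);
    }
}

abstract class EmployeeType {
    public abstract double payAmount(Employee x);
}

class Engineer extends EmployeeType {
    public double payAmount(Employee anEmployee) {
        return anEmployee.monthlySalary();
    }
}

class Manager extends EmployeeType {
    public double payAmount(Employee anEmployee) {
        return anEmployee.monthlySalary() * 2;
    }
}

class Instructor extends EmployeeType {
    public double payAmount(Employee anEmployee) {
        return anEmployee.monthlySalary() / 2;
    }
}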
Jonathan, yes, I know what polymorphism is. I think we were just misunderstanding each other.
In the example, the code with the conditional is using a "_type" attribute (not the object's class) to decide what to do. This can be refactored *to* a solution that uses polymorphism to drive behavior.
My point was that you find conditionals that could be refactored to polymorphism in statically and dynamically typed languages as well as strongly and weakly typed languages.
Did you really think I didn't know what polymorphism is? ;)
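For readers following the exchange, a conditional of the kind being discussed might look roughly like the sketch below. This is a reconstruction for illustration, not the text of the example on the Refactoring site; the class name and constants are assumptions, and the pay rules simply mirror the Java classes posted earlier in the thread:

```java
// Hypothetical "before" version: the class name and constants are assumptions.
class TypeCodeEmployee {
    static final int ENGINEER = 0;
    static final int MANAGER = 1;
    static final int INSTRUCTOR = 2;

    private final int type;            // a plain field, not the object's class
    private final double monthlySalary;

    TypeCodeEmployee(int type, double monthlySalary) {
        this.type = type;
        this.monthlySalary = monthlySalary;
    }

    // The conditional that the refactoring replaces with one subclass per case.
    double payAmount() {
        switch (type) {
            case ENGINEER:
                return monthlySalary;
            case MANAGER:
                return monthlySalary * 2;
            case INSTRUCTOR:
                return monthlySalary / 2;
            default:
                throw new IllegalStateException("unknown type code: " + type);
        }
    }
}
```

The switch branches on the value of a field, so the same shape can turn up in any language that has fields and conditionals, whatever its typing discipline.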
I don't exactly get the point of this post. First you say that static languages are better for large projects than dynamic languages. Then you talk about things that have nothing to do with typing, such as spaghetti code vs well organized abstracted code.
I'm sure Amazon with their massive amount of Perl and Google with its massive amount of Python would argue with you over the fact that you can't create large apps in dynamic languages.
VBScript and ASP weren't frustrating on large projects because VBScript is dynamically typed; they were frustrating because of the design of ASP and its failure to properly abstract the layers of the application. You can easily write an ASP.NET application, properly abstracted and maintainable, in a dynamic language.
Working with dynamic strong typing, such as Ruby has, just means that you are programming everything as you would with generics in C#. It's strongly typed; you just don't know the type of the object you're using (dynamic) until runtime. Weak typing is a totally different thing.
Dynamic typing also is like having automatic interfaces. You can do everything in C# that you can do in Ruby, but you'll be writing a lot of generics and interfaces to accomplish what is simple in Ruby.
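A small Java sketch of the "automatic interfaces" point (Java is used here to match the code elsewhere in this thread; all names are invented for illustration). In Ruby, any object that responds to a speak method could be passed in; in Java the shared capability has to be declared up front as an interface:

```java
import java.util.List;

// The capability that duck typing leaves implicit must be named explicitly.
interface Speaker {
    String speak();
}

class Duck implements Speaker {
    public String speak() { return "Quack"; }
}

class Robot implements Speaker {
    public String speak() { return "Beep"; }
}

class Chorus {
    // Accepts any list whose elements implement Speaker.
    static String all(List<? extends Speaker> speakers) {
        StringBuilder out = new StringBuilder();
        for (Speaker s : speakers) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(s.speak());
        }
        return out.toString();
    }
}
```

The interface is extra code, but it also documents in one place which objects Chorus.all will accept.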
The true power of Ruby comes from it having all the major features that a large app needs (object orientation, namespaces, inheritance, exception handling, etc). Plus things that make very concise code, such as regular expressions, closures (blocks), etc.
Static typing does nothing that makes unit testing unnecessary. Static typing is helpful for the compiler, and you can make the compiled code run faster when the types are static, but it does very little to guarantee that your application does what it's supposed to do; unit tests ensure that.
I really don't understand people's visceral reaction to the growing popularity of dynamically typed languages. Your job as a Java or .NET developer isn't going away soon, so why all the fear?
Personally, if you are a serious developer you should have a solid background in a pointer language such as C, static garbage collected languages like Java or C#, dynamic languages such as Python or Ruby, and lastly functional languages such as Scala or Haskell. If you are missing any one of those, you aren't fully grokking the full depth of software engineering.
@anonymous, well said!
"Personally, if you are a serious developer you should have a solid background in a pointer language such as C, static garbage collected languages like Java or C#, dynamic languages such as Python or Ruby, and lastly functional languages such as Scala or Haskell. If you are missing any one of those, you aren't fully grokking the full depth of software engineering."
This is a very inward-looking point of view and, in my view, is a problem in 'software engineering'; i.e. we tend to focus on our tools and languages too much. Engineering is about solving problems. I think if you are a serious developer it's mostly important to have a solid background in your development domain (e.g. finance, science, embedded, etc.). Most employers don't care if you have a wide breadth of software languages in your toolkit; they care about depth and domain expertise.
I think employers care about depth AND breadth. What good is someone who knows everything there is to know about .NET if they're a Java shop? It's more important to have an aptitude, including the ability to learn new technology, languages, APIs, environments, etc. It's not only better for the employer, but better for the individual as well, as they will fit into a greater variety of roles.
Plus you can better appreciate the strengths and weaknesses of a variety of solutions.
@jason: Because dynamic languages make it possible to leave out contextual information from the code (we all know that documentation isn't what makes code maintainable) that would otherwise make it possible to see what happens behind the scenes (type conversions in particular). This is in no way ridiculous; it's a fact.
Actually comments do make code maintainable. The best documentation is good code and good code is doable in either static or dynamic languages.
I've done a lot of programming in dynamic languages and quite frankly just have not seen that what you are describing is such a huge risk. It's the same thing when people argue that it's bad (and that you now need generics) that you can put anything into a Java collection (because you have to put an Object in there). I might mix a dog in with a list of horses! But in reality nobody does that.
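The dog-among-horses scenario can be sketched in Java as follows. The class names are invented for illustration, and the sketch takes no side on how often the mistake happens in practice; it only shows where each style would report it:

```java
import java.util.ArrayList;
import java.util.List;

class Horse {
    String neigh() { return "Neigh"; }
}

class Dog {
    String bark() { return "Woof"; }
}

// Hypothetical sketch: Stable and its methods are illustrative names.
class Stable {

    // Pre-generics style: a raw List accepts any Object.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List rawStable() {
        List horses = new ArrayList();
        horses.add(new Horse());
        horses.add(new Dog()); // compiles; the mistake is invisible here
        return horses;
    }

    // Generic style: adding a Dog to a List<Horse> would be a compile error.
    static List<Horse> typedStable() {
        List<Horse> horses = new ArrayList<>();
        horses.add(new Horse());
        return horses;
    }

    public static void main(String[] args) {
        for (Object o : rawStable()) {
            try {
                System.out.println(((Horse) o).neigh());
            } catch (ClassCastException e) {
                System.out.println("a Dog was in the stable");
            }
        }
    }
}
```

With the raw list the mix-up appears as a ClassCastException when an element is read back; with List<Horse> the equivalent add call is rejected at compile time.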
I still think your argument doesn't hold water.
I do not agree that the failure of ASP and VBScript in large projects was because of their dynamic nature. Their failure was a result of relying on bad technologies like IIS and COM (with the MS-style approach), not of their dynamic aspects.
Just name one project as large as XOOPS, Joomla, Drupal, ... which are all developed in PHP and still have a big share!
Thanks
ASP and VBScript are not dynamic languages the same way Python is. VBScript is loosely typed, meaning a var is always "anything". In dynamic languages, values keep the type they were created with and can be checked against that type.
As for type checking at the compiler level, that is good. Great even. However, type checking does not stop all defects from production. Unit tests do. Even then, not all.